GrumpyOldPizza

Members
  • Content Count

    62
  • Joined

  • Last visited

  • Days Won

    4

Everything posted by GrumpyOldPizza

  1. Ouch. It says free shipping on orders over $150 !!! You could have saved the $1 for shipping ;-)
  2. Shipping within the United States
       FedEx Ground = $1.00   0000 0001
       FedEx Saver (3-day delivery) = $2.00   0000 0010
       FedEx Express Economy (2-day delivery) = $4.00   0000 0100
       FedEx Overnight PM Delivery = $8.00   0000 1000
       FedEx Overnight AM Delivery = $16.00   0001 0000
     Shipping outside the United States
       International Economy = $4.00   0000 0100
       International Priority = $16.00   0001 0000
     Just saying ...
  3. GrumpyOldPizza

    Have feedback for TI? Please share here.

    Thanx for sharing. Interesting presentation, though I don't buy your analysis. The peripherals on MSP432 are not any simpler than, say, on STM32L4 or NXP LPC1549. Actually, what's funny is that the talk brings up one of the key weaknesses of MSP432 (and of the idea of simply taking crufty MSP430 peripherals). ANY other Cortex-M3/M4 controller I know has a way to set a group of GPIO pins on a port atomically without affecting other pins. TM4C uses some address bits to generate an update mask. STM32 has a separate SET/RESET mask. LPC chips have a set of masks that can be used (3 on LPC1549). MSP432 has only a "PxOUT". So no atomic updates unless you get creative with bit-band addressing or use LDREX/STREX/CLREX type atomic primitives. If you are at that level, complexity would not be something that concerns you at all. I do get the argument that if you come from MSP430 it's a step up. But it's a massive step down from any other ARM Cortex-M3/M4 controller other than the very first ones from Luminary Micro in 2005, like LM3S811. EDIT: It seems it's now possible to actually order the XMS432P401RIPZR variant for $6.781 (256k flash, 64k RAM, no USB). The XMS means it's still not a production part. Don't know how that compares to the similar ST part (STM32L433RC), which is just hitting the distributors as well.
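To illustrate the bit-band route mentioned in this post, here is a minimal sketch of the standard Cortex-M3/M4 peripheral bit-band alias calculation. The helper names are made up for illustration, and the P1OUT address is an assumption that should be checked against the device header. Note that a bit-band write touches one pin per access, so it gives atomic single-pin updates, not an atomic update of a whole group of pins.

```c
#include <stdint.h>

/* Cortex-M3/M4 peripheral bit-band region and its alias region. */
#define PERIPH_BASE     0x40000000u
#define PERIPH_BB_BASE  0x42000000u

/* Address of the alias word that maps to one bit of a peripheral byte. */
static inline uint32_t bitband_alias(uint32_t byte_addr, unsigned bit)
{
    return PERIPH_BB_BASE + (byte_addr - PERIPH_BASE) * 32u + bit * 4u;
}

/* ASSUMPTION: P1OUT lives at 0x40004C02; verify against the device header. */
#define P1OUT_ADDR 0x40004C02u

/* Target-only helper (hypothetical name): set or clear a single P1 pin.
 * The hardware performs the read-modify-write, so other pins are untouched. */
static inline void p1out_write_pin(unsigned pin, unsigned value)
{
    volatile uint8_t *alias =
        (volatile uint8_t *)(uintptr_t)bitband_alias(P1OUT_ADDR, pin);
    *alias = (uint8_t)(value & 1u);
}
```

Only `bitband_alias` is meaningful on a host; `p1out_write_pin` must run on the target.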
  4. GrumpyOldPizza

    Have feedback for TI? Please share here.

    The goal of the Arduino API is precisely to be usable for the dumbest possible user so that they can get simple things going quickly and then tinker around with the more complex stuff. While it is trivially possible to do more complex stuff (like a flight controller), it is not the goal of the Arduino/Wiring mindset. Not learning about the underlying hardware is often what I want. There is no good reason to know about all the details of an I2C peripheral on the 23rd different ARM controller. All I really want is to transmit/receive bytes.
  5. GrumpyOldPizza

    Have feedback for TI? Please share here.

    What you are suggesting is higher level than driverLib, which IMHO is a macro wrapper layer around the device registers. The problem with such an approach is that coming up with a good abstraction is very hard. There is no really good one-size-fits-all. A good example is ST's HAL (CubeL4 for example). It assumes some higher level locking that an RTOS could supply. The problem is that if you want to use part of the code in both the ISR and the tasks/threads, this will fall apart, and ... well ... you need something different. The Arduino abstraction is nice, but falls short on asynchronous operations. The best I have seen up to now (but admittedly not used) is Nordic Semi's SDK for nRF5, which splits things up into a HAL and a driver layer.
  6. GrumpyOldPizza

    Have feedback for TI? Please share here.

    Well, there I have to play devil's advocate. Isn't part of the value proposition the driverLib ? With the explicit goal of abstracting the peripherals enough so that it should not matter ? Also the CPU cannot be hidden completely by the compiler (though to a large extent it can). Things like NVIC are really, really different. So at some level there is a massive delta that one could take advantage of.
  7. GrumpyOldPizza

    Have feedback for TI? Please share here.

    Good question. For hobby projects it's probably a non-issue. But if you want to be aggressive with power savings it is an issue. Typically I'd always attach a DMA channel to any I2C RX, and any UART RX/TX, as well as SPI RX/TX (for sensor reading up to 4MHz). For my hobby use (Autonomous Rovers), that would be 5 to 7. So, still within 8 channels, but not a lot of spares for software triggered DMA. Perhaps it's just that 16 feels like a more appropriate number (32 definitely too much). A couple of use cases to illustrate: (1) UART, GPS input. I'd use a ping-pong buffer scheme on RX with a receive timeout. Each of the buffers is 16 bytes. This way I get an interrupt only every 16 bytes, or when a sequence of reports is done. At 115200 baud this brings the interrupt rate from about 10000 down to less than 800. (2) I2C sensor reading. DMA on RX (with TI also on TX). That means I take 2 interrupts for a typical "send an index, and then read n bytes of data from a sensor". If this is 16 bytes of sensor data, then without DMA you take 19 interrupts. (3) TFT on SPI. Here a double buffer scheme is nice. In one buffer you generate data for the new scanline you are working on, while using DMA on the other buffer to send over data/commands for the previous scanline that has already been generated. One can nicely overlap CPU and SPI. Of course that is not beneficial for all operations. (4) SD on SPI. If you send more than one 512 byte block and want to use CRC16, then you can let the CPU compute this CRC16 on the next block you are about to send, while DMA takes care of sending the current block without CPU interaction ... So a lot of uses, at least for me, mainly centered around communication.
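The UART numbers in use case (1) can be checked with a few lines of arithmetic. This is a minimal sketch, assuming 8N1 framing (10 bits on the wire per byte) and a fully saturated line; the function names are made up for illustration.

```c
#include <stdint.h>

/* Bytes per second on a UART with 8N1 framing:
 * start bit + 8 data bits + stop bit = 10 bits per byte. */
static inline uint32_t uart_bytes_per_sec(uint32_t baud)
{
    return baud / 10u;
}

/* Worst-case interrupts per second when one interrupt is taken
 * for every `bytes_per_irq` received bytes. */
static inline uint32_t uart_irqs_per_sec(uint32_t baud, uint32_t bytes_per_irq)
{
    return uart_bytes_per_sec(baud) / bytes_per_irq;
}
```

With these assumptions, 115200 baud gives 11520 interrupts per second at one interrupt per byte and 720 per second with 16-byte DMA buffers, in line with the "about 10000" and "less than 800" quoted in the post.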
  8. GrumpyOldPizza

    Have feedback for TI? Please share here.

    You are mixing up my statements. I said that an M4 is more efficient than an M0. Somebody asked about M4 vs. M0, and I tried to give an answer to that. It has nothing to do with a 430 CPU. That one is different, uses a different internal bus (not AHB/APB and crossbar), does not have a nested vectored interrupt controller and so on. It's a different class. It has substantially less CPU horsepower, but if you can get away with it, it might consume less power. And yes, with sleeping you are right. It's one piece of the puzzle, but in a lot of cases it's the 75% piece. DMA is another one, clock gating ... - Thomas
  9. GrumpyOldPizza

    Have feedback for TI? Please share here.

    Actually if you design your software stack the right way, it does mean exactly that (some caveats apply though). If you need to do some CPU work in response to an external stimulus, then an M4 will do that work faster at the same clock than an M0/M0+. Consequently you can put the CPU to sleep earlier and for a longer amount of time. This longer sleep time is what saves you most of the power. The second main trick to save power is to use DMA for IO transfers, so that you can avoid waking up the CPU from a sleep mode as much as possible (there is a nice application note available from ST, which analyses which clock frequencies are the sweet spot for a given voltage range, assuming that you have to spend a fixed amount of CPU clock cycles; their result was that the upper limit for the voltage range was always the sweet spot ...). So yes, the M4F core in the MSP432 makes it more efficient than the MSP430 core, which means in theory it should consume less power (yes, AHB and the efficiency of the sleep modes will affect that as well). But MSP432 is missing or hampering the second crucial part, which is adding peripherals that can offload the CPU to do things like batch acquisition (or sleep-walking as Atmel calls it). It's also quite telling that the current leader in ULPbench, Ambiq Micro, also chose an M4 over an M0 (http://www.extremetech.com/computing/198285-new-microprocessor-claims-10x-energy-improvement-thanks-to-subthreshold-voltage-operation this article contains some of their rationale). Back to MSP432. With its peripherals, one could argue that this is not the same target market as, say, an STM32L4, as you need fewer peripherals, less horsepower, and such. But that then raises the question, why upgrade the CPU core from MSP430 at all ? - Thomas
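The "finish faster, sleep longer" argument can be put into a toy duty-cycle model. This is only a sketch under simplifying assumptions: one fixed chunk of work per wake-up period, constant active and sleep currents, and no wake-up overhead. All parameter names are hypothetical, and any numbers fed into it are illustrative, not measurements of a real part.

```c
/* Average current (in uA) over one wake-up period.
 * work_cycles: CPU cycles needed per period (fewer on a more capable core)
 * f_hz:        CPU clock frequency while active
 * period_s:    time between external stimuli
 * active_ua:   current while running (ASSUMED constant)
 * sleep_ua:    current while sleeping (ASSUMED constant) */
static inline double avg_current_ua(double work_cycles, double f_hz,
                                    double period_s,
                                    double active_ua, double sleep_ua)
{
    double t_active = work_cycles / f_hz;          /* seconds spent awake */
    double t_sleep  = period_s - t_active;         /* seconds spent asleep */
    return (active_ua * t_active + sleep_ua * t_sleep) / period_s;
}
```

In this model a core that needs half the cycles for the same job spends half the time awake, so its average current is lower as long as its active current is not correspondingly higher. That last condition is one of the caveats the post alludes to.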
  10. GrumpyOldPizza

    Have feedback for TI? Please share here.

    Being Cortex-M4 only is not a big grief for me, probably the other way around. I like the FPU over the M3. The M0 is a nightmare, as a lot of the good debugging features that M3/M4 have get deleted on M0. Even fundamental things like the DWT_CYCCNT ... As far as I understand the M4 is also more power-efficient than the M0, because it can get work done faster, after which you can put it to sleep longer. The price difference, I really don't know. It's my hobby, and a few bucks either way will not hurt me. The MSP430 peripherals, I simply cannot understand. The older Stellaris parts had been designed with HW FIFOs in place, so you could use UART/SPI without DMA. Of course TM4C now has a 32 channel DMA controller, where you can background a lot of the IO handling. And along comes MSP432, which does away with most of the usable HW FIFOs, and, just so that the software cannot compensate, reduces the number of DMA channels to 8. If you have one UART (RX DMA), one SPI (RX/TX DMA) and one I2C (RX DMA), then half of your DMA channels are gone ... Just looking at the feature set (not the crummy HW implementation), STM32L0 & STM32L4 seem to be the better choice. Not that I want to promote somebody else's product here, it's just that I don't understand where MSP432 fits there (besides the fact that you still cannot buy the chips). So in a nutshell, please TI, focus on TM4C, bring some new parts and launchpads, perhaps a Cortex-M0+ if that saves costs. - Thomas
  11. GrumpyOldPizza

    Have feedback for TI? Please share here.

    Here are my 2 cents (after having used a bunch of the parts going back to the Luminary Micro days).
    - get rid of the MSP432. It's a mess compared to other Cortex-M4 parts (yes, even low power ones). If you want a good low power part, please use the same peripherals as on TM4C, so that code can be reused.
    - a TM4C129 Launchpad with the dimensions of the TM4C123 launchpad
    - add CMSIS-DAP to the ARM Cortex based launchpads. Nobody really wants to see yet another vendor specific protocol like the LMICDI (not that gdb remote serial was not a nice idea, it's just that it did not pan out).
    Not sure what else to say. I fundamentally think TI took the wrong turn with MSP432, which it seems to have left to die a lonely death. Perhaps folks coming from MSP430 see that differently, but it's a massive turnoff for everybody coming from more grown up microcontrollers. - Thomas
  12. GrumpyOldPizza

    TM4C and bootloaders

    Posting your linker script would help. The last time I ran into this problem, the cause was a reset vector that did not have its LSB set to 1.
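A quick way to test for the issue described in this post is to look at the second word of the application's vector table. On Cortex-M the table starts with the initial stack pointer followed by the reset handler address, and handler addresses must have bit 0 set to indicate Thumb state. A minimal sketch, with a made-up function name:

```c
#include <stdint.h>
#include <stdbool.h>

/* vector_table[0] is the initial stack pointer, vector_table[1] the
 * reset handler address. Bit 0 of a handler address must be 1 (Thumb). */
static inline bool reset_vector_is_thumb(const uint32_t *vector_table)
{
    return (vector_table[1] & 1u) != 0u;
}
```

A bootloader could run such a check on the application image before jumping to it; where that image sits in flash depends on the linker script and is not assumed here.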
  13. GrumpyOldPizza

    Where's the MSP432 going?

    Just my 2 cents here. MSP432 does not seem to go anywhere, because there is STM32L476. Both devices are fairly similar in power consumption and power saving modes. But MSP432 inherited its peripherals from MSP430, which just feel very outdated compared to STM32L476 (actually also a massive step back from TM4C). And there it just starts. The TI product goes only up to 48MHz with 256kB flash, the ST product can go up to 80MHz with 1024kB flash. The ST product has dedicated SAI support as well as a PDM decoder for MEMS microphones ... Ah, and the ST product has a CODE/DATA cache (mislabeled IMHO as ART Accelerator ;-)). The only thing that I can see the TI product having going for it is the 14bit SAR ADC. I simply suspect that TI either went back to the drawing board, or gave up. Ah, and there is this issue. A TM4C123GH6PM costs $6.23 in units of 1000. A similar STM32F401 is to be had for $3.54. That implies that the TI product is probably price-wise nowhere near the competition.
  14. GrumpyOldPizza

    LaunchPad Flight Controller

    Thanx for posting this. Lots of inspiration. Couple of questions. The ESCs and the protocol used via PWM do not seem to follow the normal RC servo pulse protocol. Is there any documentation for the modified variant ? Did you run across some simpler code for handling a HC-06 on Android (haven't programmed too much in Java, so a simple starter would be nice) ? - Thomas
  15. GrumpyOldPizza

    Autonomous Rover

    Yes, the LCD UI was tricky for more than one reason, but also the most rewarding. Last year we had a screwup because the display did not convey enough information. So this year I asked the two race engineers what *should* be displayed, why, how ... So we went through a whole list of scenarios of how you could diagnose hardware malfunctions or software issues (like how do I know the RPM sensor is working ?). It came down to the level of "I cannot see yellow in bright sunlight". Given that the HW setup was as simple as I could get it, there were a lot of ideas my kiddos could contribute (like placement of components, wiring, I/O port assignment). While that sounds perhaps simple for us adults (and perhaps us engineers), it's something else for a 10 year old and a 12 year old. - Thomas
  16. GrumpyOldPizza

    Autonomous Rover

    After a half way successful entry in the AVC 2014 competition I finally have the time to write a few words. The rover is built upon a rather cheap RC car that is nevertheless a very reasonable platform for the project, the Turnigy 1/16 Monster Beetle. The core computing platform is a Stellaris Launchpad with RobG's excellent LCD booster pack, which doubles for me as LCD and microSD unit. There is a GY-87 IMU, which consists of a MPU6050 Accel/Gyro and a HMC5883L Magnetometer, plus a BMP180 Baro (which I did not use, regrettably). The SW on the LM4F120 is able to read the MPU6050 at 1kHz and the HMC5883L at 75Hz (which is more than 50% of the I2C bus bandwidth). There is also a uBlox6 based GPS unit, the VK16U6 (in case you are wondering). It operates at a 4Hz update rate. Why 4Hz ? The ublox6 has an advertised max update rate of 5Hz. However at 5Hz SBAS/WAAS is not working, at least for me, and not with 12 satellites in perfect view, plus either PRN133 or PRN137 (if I recall correctly). Going back to 4Hz solved that ;-) There is a 13 state EKF on board that is run at 250Hz (in the pre-race configuration). The path follower is a simple "Pure Pursuit", with position input, and using the magnetometer for heading estimation. The steering is done via rather simple proportional control derived from the steering error, rather than a PID ... mainly because I ran out of time programming/tuning the PID, and because it turned out to be good enough. The vehicle was tested at 20Mph and fine-tuned for that upper boundary. The control mechanism did not try to adjust the speed other than an initial gradual ramp up to top speed, and the braking at the end. Experiments showed that flipping the vehicle in gradual turns was not an issue, and that the slippage compensated for the high speed well enough. In the AVC 2014 race config we dialed the speed back to 12Mph and 15Mph respectively. We did one run using the EKF, and the subsequent runs using GPS only (with forward prediction). The downside was a mechanical failure which caused us to fail on one of the 3 rounds. Here is an image of what the underlying issue was: The vehicle jumps, but still tries to navigate in mid-air. In one of those jumps the front left C-Bracket got nuked, as the front wheels were pointing all the way left ... Perhaps next year, at AVC 2015, there are more entrants using a TI Tiva ;-) - Thomas
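The proportional steering mentioned in this post can be sketched in a few lines. This is not the rover's actual code: it only illustrates P control on a heading error, with the error wrapped so the vehicle turns the short way round. The gain `kp`, the limit `max_cmd`, and the function names are hypothetical.

```c
#define PI_F 3.14159265f

/* Wrap an angle in radians into [-pi, pi). */
static inline float wrap_angle(float a)
{
    while (a >= PI_F) a -= 2.0f * PI_F;
    while (a < -PI_F) a += 2.0f * PI_F;
    return a;
}

/* Proportional steering: command = kp * heading error,
 * clamped to +/- max_cmd (the mechanical steering limit). */
static inline float steer_p(float target_heading, float heading,
                            float kp, float max_cmd)
{
    float cmd = kp * wrap_angle(target_heading - heading);
    if (cmd > max_cmd)  cmd = max_cmd;
    if (cmd < -max_cmd) cmd = -max_cmd;
    return cmd;
}
```

In a pure-pursuit style follower, `target_heading` would be the bearing to a look-ahead point on the path and `heading` the current heading estimate (here, from the magnetometer).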
  17. GrumpyOldPizza

    Autonomous Rover

    Haven't posted in a while ... So there was AVC 2015. Less successful for us (with 2 rovers this year). I was relegated to being the SW guy, while my kids actually built and ran the rovers. Anyway, I thought it might be interesting to post a link to the source code that was used (which of course is utterly outdated, probably ;-)). https://github.com/BizzaBoy/AVC2015-KK There are a couple of interesting pieces that might be of use outside the autonomous rover domain. First off, the concept (besides being as cheap as possible) was to take an R/C car, a TI Launchpad, RobG's TFT/SD boosterpack, and hook up a GPS, a MPU-9150, an RPM sensor, a 3 channel R/C receiver, and 2 buttons. The TFT displays all status information, which is pretty handy before starting the rover, so one can see whether the GPS is actually working ;-) Here are some of the pieces that might be of interest:
    - the MPU-9150 is sampled at 1kHz, triggered by the INT output and properly timestamped relative to a wallclock; the builtin AK8975 is sampled at 100Hz; i2c is interrupt driven
    - the GPS code supports NMEA as well as UBLOX binary; support for GPS+GLONASS is there; MTK3333, MTK3339, UBLOX6/7/8 are supported; full initialization at runtime so that this can be used without a backup battery or external flash that would store the configuration; ah, and there is proper timestamping via the PPS input
    - of course there is full speed logging to a microSDHC card ... this time DMA driven to free up more processor cycles
    - stack checking via the MPU; handy to detect stack overflows (yes, saved my bacon ;-))
    - there is a profiling system in place that buckets cycles spent on various logical tasks (like display, record, navigation ....); very handy to find out how much processor power is still left
    - lots of interesting code; started to play with atomics and bitband accesses
    - the whole system is bare metal, CMSIS based; so if one looks for a CMSIS setup for TM4C123, that might be a good starting point
    - no RTOS in use; things are either interrupt driven, or timer driven (systick callbacks), or via the PendSV exception as kind of a deferred interrupt; did this so I could half way explain the system to my son, who was running one of the rovers ;-))
    Here a few pictures, and a link to my son's entry - Thomas
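The cycle-bucketing profiler from the list above could look roughly like this. This is a sketch of the general idea, not the code in the linked repository; the bucket names and function names are made up. On a Cortex-M3/M4 target the `now` argument would typically come from the DWT cycle counter (DWT_CYCCNT); here it is passed in so the logic can also run on a host.

```c
#include <stdint.h>

/* Hypothetical logical tasks to charge cycles to. */
enum { TASK_DISPLAY, TASK_RECORD, TASK_NAVIGATION, TASK_COUNT };

typedef struct {
    uint32_t cycles[TASK_COUNT]; /* accumulated cycles per bucket */
    uint32_t start;              /* counter value at the last switch */
    int      current;            /* bucket currently being charged */
} profile_t;

static inline void profile_init(profile_t *p, int task, uint32_t now)
{
    for (int i = 0; i < TASK_COUNT; i++) {
        p->cycles[i] = 0;
    }
    p->start = now;
    p->current = task;
}

/* Charge the elapsed cycles to the current bucket, then switch to `task`.
 * Unsigned subtraction stays correct across counter wrap-around. */
static inline void profile_switch(profile_t *p, int task, uint32_t now)
{
    p->cycles[p->current] += now - p->start;
    p->start = now;
    p->current = task;
}
```

Dividing a bucket by the total elapsed cycles gives the share of processor time a task uses, which is how one would see how much headroom is left.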
  18. GrumpyOldPizza

    TI has two new Hercules LaunchPads

    The problem with that approach in general is that you cannot re-enable interrupts early. You either have to write a generic ISR shell that masks (and unmasks on return) the proper channels. This is 3 DWORDs you have to write. You also have to read, based upon the channel number, the mask you want to apply. Or you could generate a proper ISR shell for each channel and use hardcoded values to mask & unmask. So you are essentially introducing a long latency before you get to the very first useful instruction of your ISR. But it's actually worse. Suppose you want to allow an ISR handler to enable/disable channels ? Then a hardcoded unmask will not work. You actually need to keep a softcopy of what is supposed to be enabled and restore that ANDed with the bits that you want to re-enable. All I am saying is that it's a big pain, compared to NVIC or GIC. I always felt that exactly this, the better interrupt handling, was a major plus for Cortex-R/Cortex-M over Cortex-A. - Thomas
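The soft-copy scheme described in this post can be modelled on a host with plain variables. This is a simplified sketch: a single 32-bit enable word stands in for the real controller registers (the post talks about three words), lower channel numbers are assumed to have higher priority, and all names are hypothetical rather than taken from any vendor header.

```c
#include <stdint.h>

static uint32_t irq_hw_enable;   /* stand-in for the hardware enable register */
static uint32_t irq_soft_enable; /* soft copy: what software wants enabled */
static uint32_t irq_masked;      /* channels masked by running ISR shells */

/* Hardware state is always the soft copy minus whatever is masked. */
static inline void irq_apply(void)
{
    irq_hw_enable = irq_soft_enable & ~irq_masked;
}

/* Enable or disable a channel (0..31); safe to call from inside a handler. */
static inline void irq_set_enabled(unsigned ch, int on)
{
    if (on) irq_soft_enable |= 1u << ch;
    else    irq_soft_enable &= ~(1u << ch);
    irq_apply();
}

/* Shell entry: mask this channel and all lower-priority ones
 * (channel numbers >= ch). Returns the previous mask for isr_exit(). */
static inline uint32_t isr_enter(unsigned ch)
{
    uint32_t prev = irq_masked;
    irq_masked |= ~((1u << ch) - 1u);
    irq_apply();
    return prev;
}

/* Shell exit: restore the previous mask, ANDed with the soft copy. */
static inline void isr_exit(uint32_t prev)
{
    irq_masked = prev;
    irq_apply();
}
```

If a handler for channel 5 disables channel 9 while it runs, the exit path restores only what the soft copy still allows, so channel 9 stays off. A shell with a hardcoded unmask value would have switched it back on.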
  19. GrumpyOldPizza

    TI has two new Hercules LaunchPads

    Anthony, mind checking the TRM for me there ? I might have misunderstood the VIM documentation ... but the CHANCTRL[0:23] registers imply that you can map exactly one interrupt per channel. You cannot have multiple interrupt sources for one single channel. So you cannot really group, you can just establish a linear priority sequence. - Thomas
  20. GrumpyOldPizza

    Stellaris/Tiva MFLOPS/MHz?

    This is all tricky, and to be honest there are some things I have not understood (especially how VFMA is implemented with 3 cycles + 1 cycle latency as opposed to 1 cycle + 3 cycles latency). VFMA is a MAC, but it takes 3 cycles. Why ? If VMUL and VADD take 1 cycle each, what do the various MAC variants help ? Anyway, let's say you do matrix operations (which are typically MAC operations). Let's say you multiply a 4 element vector by a 4x4 matrix. Then you have 20+2 loads, 4+1 stores, 16 multiplications, and 12 additions. This is 55 operations, hence 55 cycles, into which you crammed 28 floating point operations. Thus about 0.50 Mflops/MHz. (The +2 & +1 is the overhead of the VLDM/VSTM where no data gets transferred.) The data is from the Cortex-M4 TRM, section 7.2. It also points out a latency of 1 cycle. Back to the example above. Say you want to multiply an array of 4 element vectors by a 4x4 matrix, and the matrix is preloaded, then you'd spend 4+1 loads, 4+1 stores, 16 multiplies and 12 adds. 38 cycles to do 28 floating point operations, or 0.74 Mflops/MHz.
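The tallies in this post are easy to reproduce. The sketch below uses the post's own counting (one cycle per load, store, multiply and add, with the VLDM/VSTM overhead folded into the load and store counts); the type and function names are made up for illustration.

```c
/* Operation counts for one kernel, each assumed to cost one cycle. */
typedef struct {
    unsigned loads, stores, muls, adds;
} op_count_t;

static inline unsigned total_cycles(op_count_t c)
{
    return c.loads + c.stores + c.muls + c.adds;
}

/* Only multiplies and adds count as floating point operations. */
static inline unsigned flops(op_count_t c)
{
    return c.muls + c.adds;
}

/* Flops per cycle, numerically the same as Mflops/MHz. */
static inline double flops_per_cycle(op_count_t c)
{
    return (double)flops(c) / (double)total_cycles(c);
}
```

Feeding in {22, 5, 16, 12} for the single vector case gives 28 flops in 55 cycles (about 0.51), and {5, 5, 16, 12} for the preloaded matrix case gives 28 flops in 38 cycles (about 0.74).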
  21. GrumpyOldPizza

    TI has two new Hercules LaunchPads

    I am rather familiar with ARMv7-AR (as opposed to ARMv6-M and ARMv7-M and ARMv4T [what a mess ;-)]). I would argue that the code you'd need to implement priority groups in software is pretty much identical for all non-M profiles. Just looking at other Cortex-R4/R5/R7 implementations, like the Spansion FCR4, they all have interrupt controllers that have maskable priorities. Again, I am just negatively surprised. Here is what I usually do, where I'd need nice grouping. Say I have a bunch of I2C devices and some of them use time triggered reads, and some signal DRDY via a separate interrupt line to the MCU. If I put all of the interrupt sources for the I2C devices at the same priority level, then I can guarantee exclusive I2C access (say to my I2C transaction queue). If the interrupts are on different levels, I need to make sure through other means that I2C transactions get queued properly, as a higher priority ISR could corrupt the transaction queue ... While it is not that outlandishly tricky to get that done, the interrupt priority solution is easier and safer from a software point of view. - Thomas
  22. GrumpyOldPizza

    TI has two new Hercules LaunchPads

    The 2 cores are in lockstep. I'd assume that the interrupts would get broadcast properly. Doing this by hand as you suggested adds a huge overhead to the ISR shell. I did that ages ago for an ARMv4T device with a VIC, and there the cost was about 20 clock cycles on entry and on exit. That device had only 32 IRQs, hence fewer registers to touch.
  23. GrumpyOldPizza

    TI has two new Hercules LaunchPads

    I am tempted to use the LAUNCHXL2-RM46 for some rover tinkering. It has a lot of FLASH/RAM, and double precision floating point. But it has this utterly useless VIM as interrupt controller. Why couldn't they just use a GIC with interrupt priority levels and priority masking ...
  24. GrumpyOldPizza

    Stellaris/Tiva MFLOPS/MHz?

    Good question. VADD/VMUL take 1 clock cycle each (to issue). Then there is VFMA (fused multiply add) which takes 3 clock cycles, which implies a latency of 1 clock for the multiply. So I'd say you have 1 MFLOP/MHz, assuming perfect code. That however does not take into consideration that you have to load the operands and then store the results again. - Thomas