43oh

Convert char to integer issue



Though I think encoding comes into play whenever you do anything with text. So displaying a line of text on a screen implies that the "screen" can decode whichever encoding you're sending to it. There is no possible way to have anything like unencoded text on a computer, it is either encoded as ASCIIZ, length+ASCII, NUL-terminated UTF-8, etc.

That depends on the level you're working at. Yes, at the bottom it's all bytes (or bits, or transistor states, or whatever). In C, you're nominally limited to text that can be expressed as ASCII characters, and the side comment that led us down this rabbit hole was prompted by the OP's confusion between text strings and sequences of characters and NUL-terminated sequences of characters: three distinct concepts.
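To make the difference between "bytes up to the NUL" and "characters of text" concrete in C, here is a minimal sketch. The helper name `utf8_code_points` is made up for illustration, and it assumes the input is well-formed UTF-8. The sample bytes are the same ones that appear in the Python 3 output later in this thread.

```c
#include <stddef.h>

/* Hypothetical helper: count the code points in a NUL-terminated string,
 * assuming it holds well-formed UTF-8. Continuation bytes have the bit
 * pattern 10xxxxxx and are skipped, so only the first byte of each
 * encoded character is counted. */
size_t utf8_code_points(const char *s)
{
    size_t n = 0;
    for (; *s != '\0'; ++s) {
        if (((unsigned char)*s & 0xC0) != 0x80) {
            ++n;
        }
    }
    return n;
}

/* Seven Cyrillic letters written as explicit UTF-8 bytes, so the example
 * does not depend on the encoding of the source file. */
const char energia_utf8[] =
    "\xd1\x8d\xd0\xbd\xd0\xb5\xd1\x80\xd0\xb3\xd0\xb8\xd1\x8f";
```

For this string, `strlen()` reports 14 (bytes before the NUL) while the helper reports 7 (characters). For plain ASCII such as "text" both give 4, which is why the distinction is easy to overlook.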

 

I highly approve of the impromptu college-level lecture on information representation in a thread otherwise begun by a newbie at C :-D

Even if it's not necessary to completely understand a specific subtlety at a particular stage of development, I do think it's worth a hic sunt dracones (i.e., "by the way, you're making an assumption here that won't always work out for you").

 

Back to the lecture.

 

Unlike C, Python treats Unicode strings as a first-class data type. You can operate on them (calculate length, extract substrings, sort, concatenate) with complete disregard for how the text is represented as a sequence of characters, and how each character is represented. You can do much the same in C++11 with wide character support. In practice, these systems generally use UTF-16 (or UCS-2) or UTF-32 (UCS-4) underneath, but to the developer it's just text in some arbitrary language.

 

In these environments, you can certainly manipulate and display энергия without caring how that string is encoded in memory or by the I/O subsystem (which might translate for you from the internal representation to the encoding specified by the environment, e.g. the LANG variable or a previous invocation of setlocale(3)). In an embedded environment, you are more likely to need to know that the display you're writing to requires a specific byte to represent a specific character (extended ASCII), or that you must do the translation from characters to glyph bitmaps yourself.

 

So I understand that in earlier versions of PyXB you made the (implied?) decision that your encoding was ASCIIZ, while in fact this was a bypass to use text that was actually UTF-8, am I correct?

Then what did you mean when you said you were storing text in the xmlt type, while data stored in xmld was encoded? Because as far as I can see, both types would in fact be UTF-8 encoded text.

Here's an example Python program to play with. It works in both Python 2 (2.6+?) and Python 3.

# -*- coding: utf-8 -*-
from __future__ import unicode_literals
import binascii

te = 'text'
de = te.encode('utf-8')

print(type(te))
print(type(de))

print(te)
print(binascii.hexlify(te))
print(de)
print(binascii.hexlify(de))
print(te == de)

tr = 'энергия'
dr = tr.encode('utf-8')
print(tr)
print(binascii.hexlify(tr))
print(dr)
print(binascii.hexlify(dr))
print(tr == dr)
In Python 2, the t* values have type unicode and the d* values have type str. (Without the "from __future__" line the t* values would also have type str because Python 2 failed to distinguish sequences of (ASCII) characters from sequences of bytes.)

 

In Python 3, the t* values have type str and the d* values have type bytes.

 

The output from this under Python 2 is:

llc[49]$ python /tmp/x.py
<type 'unicode'>
<type 'str'>
text
74657874
text
74657874
True
энергия
Traceback (most recent call last):
  File "/tmp/x.py", line 20, in <module>
    print(binascii.hexlify(tr))
UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-6: ordinal not in range(128)
In the first block, you see that the text and encoded versions have the same bit representation and compare equal, even though they have different types.

 

The second block blows chunks, because binascii.hexlify() can't operate on strings that have non-ASCII characters. If you comment out that line, you get a warning and the two strings are not equal.

 

Under Python 3.4.2 you have to comment out the hex conversion of the text versions because it doesn't know how to convert Unicode to bytes, but doing that you get:

llc[50]$ /usr/local/python-3.4.2/bin/python /tmp/x.py
<class 'str'>
<class 'bytes'>
text
b'text'
b'74657874'
False
энергия
b'\xd1\x8d\xd0\xbd\xd0\xb5\xd1\x80\xd0\xb3\xd0\xb8\xd1\x8f'
b'd18dd0bdd0b5d180d0b3d0b8d18f'
False
Even for the strings that are entirely ASCII, the text and the UTF-8 encoded text are not equal, because Python 3 distinguishes them by type.

 

This is why, in unit tests where I was checking whether the XML was right, it was necessary to know whether the XML was text or had been encoded in some way, e.g. for storage on disk. This arose in part because Python's standard library for converting Document Object Model (DOM) representations of XML into "XML" produces encoded text ready to be transferred to another system, not Unicode strings suitable for use in the application.


Hello friends

 

I see that this thread has spiraled out of what I could ever learn by the stretch of the imagination :)

 

But I still have the problems when I convert char to integer.

 

This part of the code works as intended and the original problem was solved by increasing the array size, but now the MCU crashes and reboots in a remote part of the code that doesn't even touch this part!

 

This is the actual code

            // Put on onftime hours/minutes in a array + terminating NULL byte.
            char onfTimerH[3] = {data[9], data[10]};
            char onfTimerM[3] = {data[12], data[13]};
            
            // Convert onftime hours/minutes to a integer.
            int i_onfTimerH = (int) atoi(onfTimerH);
            int i_onfTimerM = (int) atoi(onfTimerM);

            i_onfTimerMin = (i_onfTimerH*60) + i_onfTimerM;  // Calculate onftime in minutes.

As soon as I remove one of the char-to-int conversion lines, like below, everything works!

            // Put on onftime hours/minutes in a array + terminating NULL byte.
            char onfTimerH[3] = {data[9], data[10]};
            char onfTimerM[3] = {data[12], data[13]};
            
            // Convert onftime hours/minutes to a integer.
            int i_onfTimerH = (int) atoi(onfTimerH);
            //int i_onfTimerM = (int) atoi(onfTimerM);

            i_onfTimerMin = (i_onfTimerH*60);  // Calculate onftime in minutes.

What am I doing wrong?

 

BTW, the reason I need two char-to-int lines is that I need to convert hours and minutes to total minutes.

 

For example the data could hold:

data[9] and data[10] is the hours, lets say 09 hours

data[12] and data[13] is the minutes, lets say 99 minutes

 

At the moment I convert both chars to int like in the code above and then add them ((i_onfTimerH*60) + i_onfTimerM), but if someone can tell me a better way of converting these chars to total minutes, that would remove the need for the two char-to-int lines. Please note that the hours and minutes can both range from 01-99.

 

Looking forward to any replies.

 

Best regards

Andreas


 

            // Put on onftime hours/minutes in a array + terminating NULL byte.
            char onfTimerH[3] = {data[9], data[10]};
            char onfTimerM[3] = {data[12], data[13]};

 

Have you confirmed that these arrays have the ASCIIZ values that you expect, by displaying them?

 

In C89, the elements of an aggregate initializer like this must be constant expressions, computable at load time. C99 permits what you're doing, and GCC allows it as an extension in older modes, but it may not be working correctly.
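One way to take the initializer question out of the picture is to fill the buffer by plain assignment and write the terminator explicitly. The following is only a sketch: `two_digits_to_int` is a hypothetical helper name, and it assumes `data` is long enough that `data[pos + 1]` is valid.

```c
#include <ctype.h>
#include <stdlib.h>

/* Hypothetical helper: convert the two characters data[pos] and
 * data[pos + 1] to an int in the range 0..99.
 * Returns -1 if either character is not an ASCII digit. */
int two_digits_to_int(const char *data, int pos)
{
    char buf[3];

    buf[0] = data[pos];
    buf[1] = data[pos + 1];
    buf[2] = '\0';              /* explicit NUL terminator */

    if (!isdigit((unsigned char)buf[0]) || !isdigit((unsigned char)buf[1])) {
        return -1;
    }
    return atoi(buf);
}
```

Because every element is assigned at run time, this form does not rely on any particular C standard's rules for initializers, and the digit check makes it obvious when the buffer does not hold what you expected.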

 

I suggest it's worth your time debugging this approach so you understand why it doesn't work. Then replace it with this solution (which you will first test, since I haven't done so):

 

  int hr = data[10] - '0' + 10 * (data[9] - '0');
  int min = data[13] - '0' + 10 * (data[12] - '0');
  return min + 60 * hr;
Also, if you are using gcc, compile with flags -Wall -Werror to detect various problems. You might add -ansi -pedantic.
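A slightly more defensive variant of the digit arithmetic is sketched here, taking the hours from data[9..10] and the minutes from data[12..13] as described earlier in the thread. The function name `onf_total_minutes` and the -1 error return are illustrative choices, not part of the original code.

```c
#include <ctype.h>

/* Hypothetical helper: hours are the digits at data[9] and data[10],
 * minutes are the digits at data[12] and data[13].
 * Returns the total number of minutes, or -1 if any of the four
 * characters is not an ASCII digit. */
int onf_total_minutes(const char *data)
{
    int hr;
    int min;

    if (!isdigit((unsigned char)data[9])  ||
        !isdigit((unsigned char)data[10]) ||
        !isdigit((unsigned char)data[12]) ||
        !isdigit((unsigned char)data[13])) {
        return -1;
    }

    hr  = 10 * (data[9]  - '0') + (data[10] - '0');
    min = 10 * (data[12] - '0') + (data[13] - '0');
    return 60 * hr + min;
}
```

With both fields limited to 99, the largest possible result is 99 * 60 + 99 = 6039, which fits comfortably in a 16-bit int. No temporary char arrays or atoi calls are needed, so the stack cost is just the two local ints.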
But I still have the problems when I convert char to integer.

This part of the code works as intended and the original problem was solved by increasing the array size, but now the MCU crashes and reboots in a remote part of the code that doesn't even touch this part!

As soon as I remove one of the char-to-int conversion lines, like below, everything works!

I suspect you might have run into memory allocation issues. If you have lots of nested function calls, lots of global variables or have allocated a large heap/free space chunk, your memory regions may start to overlap.

Note that the MSP430G2553 has only 512 bytes of RAM, which is room for just 256 native (16-bit) integers. So every byte counts, so to speak. The line you commented out also allocates an integer on the stack.


Hello again

 

Sorry for the delay in replying. I truly appreciate the help you guys give :)

 

I have tried a bunch of things and got it working now, but I kept the original code and instead slightly changed the other part of the code that was giving me crashes. I downloaded TI's CCS and tried to debug with it, but I couldn't work out how to do it. It did give me about 50 compile warnings (nothing major) that Energia didn't say anything about, though. They are all fixed now.

 

The code now works as intended and I have tested it really hard without any issues, but it would be interesting to know what caused the crash. Changing one part of the code that worked by itself caused errors in a remote, seemingly random part of the code, which made me also suspect some kind of lack-of-memory issue.

 

Since I am a noob please bear with me. I always thought that Energia or whatever compiler/uploader would tell me if I have reached some kind of space/RAM limit in MCU. But that isn't true?

 

I have about 1500 lines of code.

The globals vars are as follows:

7 int

3 long

About 250 chars

 

And a hell of a lot of local vars. Also, I use MspFlash.h to write some data to the flash memory.

 

How can I calculate if I need a bigger MCU?

 

Best regards

Andreas


In CCS and the like, you can configure your compiler to output a static usage map. This means the space stated there is already in use before entering your main/setup function.

In addition to that, every nested function call adds 4 bytes plus the size of the function-scope variables. By nested I mean this: when function A calls function B, which calls function C, and after that function A calls function D, then the deepest stack would be 3*4 + variables in A, B and C, or it would be 2*4 + variables in A and D, depending on which is larger.
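The rule of thumb above can be written down as a small calculation. This is only a sketch: the 4-byte per-call overhead is the figure quoted in this post rather than something measured, and the function names are invented for the example.

```c
#include <stddef.h>

/* Per-call overhead in bytes, as quoted in the post (an assumption). */
#define CALL_OVERHEAD_BYTES 4u

/* Stack bytes used by one call chain: for each function in the chain,
 * the call overhead plus the size of that function's local variables. */
unsigned chain_stack_bytes(const unsigned *locals, size_t depth)
{
    unsigned total = 0;
    size_t i;

    for (i = 0; i < depth; ++i) {
        total += CALL_OVERHEAD_BYTES + locals[i];
    }
    return total;
}

/* Worst case of two alternative chains, e.g. A->B->C versus A->D. */
unsigned worst_case_stack(const unsigned *chain1, size_t depth1,
                          const unsigned *chain2, size_t depth2)
{
    unsigned first = chain_stack_bytes(chain1, depth1);
    unsigned second = chain_stack_bytes(chain2, depth2);

    return first > second ? first : second;
}
```

As an example with made-up local sizes: if A, B and C use 6, 2 and 8 bytes of locals, that chain costs 3*4 + 16 = 28 bytes. If D uses 40 bytes, the chain A->D costs 2*4 + 46 = 54 bytes, so the shallower chain is the one that sets the worst case.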

 

In your case, your static RAM usage would be 7 * sizeof(int) + 3 * sizeof(long) + 250 * sizeof(char).

sizeof(char) is 1; on the MSP430, sizeof(int) is 2 and sizeof(long) is 4.

So your RAM usage is already 14+12+250=276 bytes of the 512 bytes available to you, which leaves only 236 bytes for all other function calls and stack variables.

Then the MspFlash object takes 10 bytes I think, and the call to MspFlash.write takes another 6 bytes (stack), so you're down to 220 bytes.

Depending on the rest of your application, this is plenty to work with. Do you use malloc, new or anything the like in your application?
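The static part of that estimate can be reproduced with a few lines of C. This sketch assumes the MSP430 sizes of 2 bytes for int and 4 bytes for long, and uses fixed-width types so it gives the same answer when compiled on a PC. The macro and function names are made up for illustration.

```c
#include <stdint.h>

#define G2553_RAM_TOTAL 512u   /* RAM size of the MSP430G2553, per the thread */
#define GLOBAL_INTS       7u   /* global variable counts given by the OP */
#define GLOBAL_LONGS      3u
#define GLOBAL_CHARS    250u

/* int16_t and int32_t stand in for the MSP430's int and long. */
unsigned static_ram_estimate(void)
{
    return GLOBAL_INTS  * (unsigned)sizeof(int16_t)
         + GLOBAL_LONGS * (unsigned)sizeof(int32_t)
         + GLOBAL_CHARS * (unsigned)sizeof(char);
}

/* What remains for the stack (and heap, if any) after the globals. */
unsigned ram_left_after_globals(void)
{
    return G2553_RAM_TOTAL - static_ram_estimate();
}
```

This only counts the globals the OP listed. Library objects, string buffers and anything else the framework pulls in come on top, so the linker's own numbers are the ones to trust.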



Thanks for your reply!

 

I looked at the map file and I was overwhelmed with data, didn't get much of that :)

 

I have quite a lot of nested function calls but not deeper than function A calling function B.

 

Do I understand correctly that all the global vars are allocated in RAM from the beginning, while the local vars are allocated when a function is called and released when it exits? So to run out of RAM, a single call chain would need more than the remaining roughly 220 bytes you estimated?

 

I don't use any malloc or new. The most of the code is just very basic code, IF statements, loops etc.

 

I did look at this thread:

http://forum.43oh.com/topic/3682-flash-and-estimated-ram-usage/

 

This gave me the following:

   text    data     bss     dec     hex
  12842       4     448   13294    33ee

So this would mean I am using 452B (data+bss) out of 512B RAM? Feels dangerously close :/ How accurate is this?

 

Kind regards

Andreas


It is exactly the right value, not an estimate. So somewhere I overlooked about 170 bytes :blush: On the other hand, you have 60 bytes to spare, which is tight, but doable. As I said already, each deeper function call takes another 4 bytes plus the size of the function-scope variables (including function arguments).

I think the bootstrap takes some stack, then main(), then (assuming Energia sketch) setup(), or loop() or serialEvent() (the latter is obfuscated by the sketch framework).

So you're 4 calls deep a fair amount of time, roughly 20 bytes. So unless you have a bunch of local variables in your loop() you don't really have to worry I think.
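To put numbers on that headroom, here is a minimal sketch using the data and bss figures from the size output quoted in this thread. The per-call cost passed in is an assumption you would have to estimate for your own code, and the function names are invented for the example.

```c
/* Figures from the size output quoted in the thread. */
#define G2553_RAM_BYTES 512
#define SIZE_DATA         4
#define SIZE_BSS        448

/* Static RAM already in use before main() runs: .data plus .bss. */
int static_ram_used(int data, int bss)
{
    return data + bss;
}

/* Bytes left after `depth` nested calls that each cost `per_call`
 * bytes (call overhead plus locals). A negative result means the
 * stack would run into the statically allocated data. */
int stack_headroom(int ram, int used, int depth, int per_call)
{
    return ram - used - depth * per_call;
}
```

With 452 bytes used statically, four nested calls at about 5 bytes each leave 40 bytes. The same four calls at 20 bytes each, for example because of a few local buffers, would come out at -20, meaning the stack has already overwritten static data.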


This gave me the following:

   text    data     bss     dec     hex
  12842       4     448   13294    33ee
So this would mean I am using 452B (data+bss) out of 512B RAM? Feels dangerously close :/ How accurate is this?

 

It's exactly correct for statically allocated memory. Both heap (malloc) and stack (local function variables aka auto) increase the total memory required.

 

Stack starts at the top of RAM and goes down toward the top of bss. If you have call chains nesting more than two or three levels, or basically any local variables (such as those char[3] arrays), you're probably going to get corrupted data in whatever you have at the higher RAM addresses. You can use the __read_stack_pointer() intrinsic in mspgcc to determine what the stack pointer is at various points, and compare that against the symbol __bss_end defined in <sys/crtld.h>. If they cross, you have a problem.

 

The names of the intrinsic and end-of-static-RAM symbol depend on the toolchain.
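The comparison itself is plain address arithmetic, sketched below so it can be tried on a PC. On the target you would pass in the value returned by the stack pointer intrinsic and the address of the end-of-static-RAM symbol named above. I have not checked their exact types here, so the casts needed are left as an assumption, and the function names are invented for the example.

```c
#include <stdint.h>

/* Bytes between the current stack pointer and the end of statically
 * allocated RAM. The stack grows downward, so this is the room left.
 * Returns 0 if the stack has already reached or crossed that address. */
uintptr_t stack_gap(uintptr_t stack_pointer, uintptr_t static_end)
{
    return stack_pointer > static_end ? stack_pointer - static_end : 0;
}

/* Nonzero if the stack has reached or crossed the static data. */
int stack_has_collided(uintptr_t stack_pointer, uintptr_t static_end)
{
    return stack_pointer <= static_end;
}
```

As an illustration, assume RAM starts at 0x0200 and 452 bytes are statically allocated, so static data ends at 0x03C4. A stack pointer of 0x03E0 then leaves 28 bytes, while 0x03C0 means the two regions already overlap. Note that a check like this only sees the stack depth at the moment it runs, so it is most useful when placed in the most deeply nested functions.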


Sounds like it's time to move up in hardware.  One idea would be to try out the new FR4133 LaunchPad - that chip only has 16K FRAM (similar to the G2553's 16K Flash), so it's fully supported by CCS with both the TI compiler and the new RedHat GCC compiler, but it has 2KB SRAM which would solve your problem here.  Energia doesn't support it though. (That might change by the next release, but I'm not sure)

 

The F5529 LaunchPad is an option, as it has 8KB SRAM.  128KB Flash is usable with CCS with the RedHat GCC compiler, but the TI compiler will only support up to 16K usage.  Energia supports it for the lower ~48K of flash I believe.
