From: Brendan on 22 Apr 2008 02:43

At a prior job, people were using std::string to hold arbitrary binary data, as opposed to vector<char>. I've seen a few people in this group poopoo that notion, and I'd like to find out what they think the pitfalls, if any, of using string to hold binary data are. I ask, because otherwise the code base was pretty high quality, and string does offer a number of extra search based member functions that vector does not.

Additionally, is there any strong reason to use unsigned char as opposed to char to hold binary data where the high order bit might be set? Again, in practice I've mostly seen char used.

Thanks

--
[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]
From: Lance Diduck on 22 Apr 2008 06:19

On Apr 22, 1:43 pm, Brendan <catph...(a)catphive.net> wrote:
> At a prior job, people were using std::string to hold arbitrary binary
> data, as opposed to vector<char>. I've seen a few people in this group
> poopoo that notion, and I'd like to find out what they think the
> pitfalls, if any, of using string to hold binary data are. I ask,
> because otherwise the code base was pretty high quality, and string
> does offer a number of extra search based member functions that vector
> does not.
>
> Additionally, is there any strong reason to use unsigned char as
> opposed to char to hold binary data where the high order bit might be
> set? Again, in practice I've mostly seen char used.
>
> Thanks

There is nothing wrong with using std::string to hold binary data. There is nothing in std::string that assumes any particular text encoding; ipso facto, std::string only holds binary data.

There is a compelling reason to use unsigned: when doing comparisons, many processors must "sign extend" char data to something the size of an int. On Intel this is the movsx instruction. When the data is unsigned, the move is a "zero extend" (Intel movzx). movzx is typically one cycle less than movsx. If you are not doing compares, tests, or such on the binary data, then it doesn't make a difference.

Also note that basic_string<unsigned char> will not play nice with cout.

Lance
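To illustrate the point that std::string assumes no text encoding, here is a minimal sketch of storing and searching a buffer that contains zero bytes. The helper names make_blob and find_pattern are hypothetical and do not come from the thread. The main pitfall it shows is the choice of constructor: the (pointer, length) form copies every byte, while the single-pointer form stops at the first zero byte.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: copy `len` raw bytes into a std::string.
// The (pointer, length) constructor preserves embedded zero bytes.
std::string make_blob(const char* data, std::size_t len) {
    assert(data != nullptr);
    return std::string(data, len);
}

// Hypothetical helper: search for a byte pattern of explicit length.
// Uses the find(const char*, pos, n) overload, so the pattern may
// itself contain zero bytes. Returns std::string::npos if not found.
std::size_t find_pattern(const std::string& blob,
                         const char* pattern, std::size_t pattern_len) {
    assert(pattern != nullptr);
    return blob.find(pattern, 0, pattern_len);
}
```

Calling find_pattern(blob, "\xFF\x00", 2) searches for the two bytes 0xFF 0x00; passing the length explicitly is what keeps the trailing zero byte part of the pattern.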
From: Charles on 22 Apr 2008 13:27

Lance Diduck wrote:
> If you are not doing compares,
> tests, or such on the binary data then it doesnt make a difference.

Brendan - If you _are_ doing tests and depending on the application, you should consider using the Boost dynamic_bitset library (http://www.boost.org/doc/libs/1_35_0/libs/dynamic_bitset/dynamic_bitset.html).

--
Chuck
From: Marco Manfredini on 23 Apr 2008 05:13

Brendan wrote:
> At a prior job, people were using std::string to hold arbitrary binary
> data, as opposed to vector<char>. I've seen a few people in this group
> poopoo that notion, and I'd like to find out what they think the
> pitfalls, if any, of using string to hold binary data are. I ask,
> because otherwise the code base was pretty high quality, and string
> does offer a number of extra search based member functions that vector
> does not.

The standard specifies no complexity bounds for the string operations (exception: swap). char_traits<> has bounds given, but that doesn't help here. You may end up with an implementation which has, for example, very fast inserts but slow replacements, or the other way round.

>
> Additionally, is there any strong reason to use unsigned char as
> opposed to char to hold binary data where the high order bit might be
> set? Again, in practice I've mostly seen char used.

Predictable sorting order, maybe, or defined behavior on overflow?

--
IYesNo yes=YesNoFactory.getFactoryInstance().YES;
yes.getDescription().equals(array[0].toUpperCase());
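The "predictable sorting order" remark can be sketched as follows. Whether plain char is signed is implementation-defined, so a byte such as 0x80 may order before or after 0x7F when compared as char. The function names below are hypothetical; the sketch simply converts each byte to unsigned char before comparing, which gives the same order on every platform.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical comparator: order bytes by their unsigned value (0..255),
// independent of whether plain char is signed on this platform.
bool byte_less(char a, char b) {
    return static_cast<unsigned char>(a) < static_cast<unsigned char>(b);
}

// Sort a byte buffer in unsigned order.
void sort_bytes(std::vector<char>& bytes) {
    std::sort(bytes.begin(), bytes.end(), byte_less);
}

// Small self-check: 0x80 must sort after 0x7F in unsigned order.
void demo_byte_order() {
    std::vector<char> bytes;
    bytes.push_back('\x80');
    bytes.push_back('\x01');
    bytes.push_back('\x7F');
    sort_bytes(bytes);
    assert(static_cast<unsigned char>(bytes[0]) == 0x01);
    assert(static_cast<unsigned char>(bytes[1]) == 0x7F);
    assert(static_cast<unsigned char>(bytes[2]) == 0x80);
}
```

Storing the data as unsigned char in the first place removes the need for the casts, which is one practical argument for it when ordering matters.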