From: Gordon Sande on
On 2010-04-22 07:18:31 -0300, "analyst41(a)hotmail.com"
<analyst41(a)hotmail.com> said:

> On Apr 22, 12:21 am, Dave Allured <nos...(a)nospom.com> wrote:
>> analys...(a)hotmail.com wrote:
>>
>>> I need to keep in memory an array of around 60000 character variables,
>>> each element of which can have a max length of 4000 bytes. But if you
>>> add up the lengths of all the actual data values, it is only 1/8 of
>>> 60000*4000.
>>
>>> What would be the cleanest way to store this data to take advantage of
>>> this fact?
>>
>> Something else to consider, an old fashioned yet simple approach if you
>> do not need to dynamically edit the set of strings:
>>
>> integer, parameter :: n_strings = 60000
>> integer, parameter :: max_len = 4000
>> integer, parameter :: buf_size = n_strings * (max_len / 8)
>> character(buf_size) :: strs
>> character(max_len)  :: in
>> integer i, p, infile
>> integer p1(n_strings), p2(n_strings)
>>
>> p = 1
>>
>> do i = 1, n_strings
>>    read (infile, '(a)') in
>>    p1(i) = p
>>    p = p + len_trim (in)
>>    p2(i) = p - 1
>>    strs(p1(i):p2(i)) = in
>> end do
>> end do
>>
>> i = 1234
>> write (*,'(a,i0,a,a)') 'string #', i, ' = ', strs(p1(i):p2(i))
>>
>> The buffer is easily accessed and searched. Inserting, removing, and
>> reordering are problematical. Like others said, it depends on what you
>> need to do with the strings.
>>
>> --Dave
>
>
> At this point, all I want to be able to do are things like "find
> the nth element", "give me all the elements that contain a particular
> substring", "print out all zero length elements" etc.

Listing the desired operations is very good. The tough one here is
finding elements with a given substring. The suggestions that have
been given have basically been directed at symbol tables which
will match whole entries. There are fancy schemes for searching
large strings for specified substrings. Searching a bunch of smaller
strings may well have different tradeoffs. It is the stuff that one
can find in some advanced algorithms books. I believe that the folks who
do the Oxford dictionary (I am sure there are others but I do not follow
such things) have multiple papers on their advanced methods. Us mere
mortals have to trust the various "grep" libraries for the few times
the text manipulation problems go beyond symbol tables.
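
For the three operations listed, a plain linear scan over Dave's packed
buffer may already be enough at this size. The following is only a sketch
under that assumption, not one of the fancy schemes mentioned above: it
reuses the names strs, p1 and p2 from Dave's code, while the module and
procedure names (packed_search_sketch, nth_element, is_empty,
find_containing) are made up for illustration. It relies on the standard
INDEX intrinsic and on Fortran 2003 deferred-length character results.

```fortran
module packed_search_sketch
  implicit none
contains

  ! Return element i of the packed buffer.
  ! The result has length zero when p2(i) = p1(i) - 1.
  function nth_element(strs, p1, p2, i) result(s)
    character(*), intent(in) :: strs
    integer, intent(in) :: p1(:), p2(:), i
    character(:), allocatable :: s
    s = strs(p1(i):p2(i))
  end function nth_element

  ! True when element i was stored with zero length
  ! (an empty or all-blank input line, since len_trim was used).
  pure logical function is_empty(p1, p2, i)
    integer, intent(in) :: p1(:), p2(:), i
    is_empty = p2(i) < p1(i)
  end function is_empty

  ! Collect the indices of all elements that contain pattern.
  ! Each element is searched separately, so a match can never
  ! straddle the boundary between two neighbouring elements.
  ! n_hits is the full count even if hits(:) is too small to hold them all.
  subroutine find_containing(strs, p1, p2, pattern, hits, n_hits)
    character(*), intent(in) :: strs, pattern
    integer, intent(in) :: p1(:), p2(:)
    integer, intent(out) :: hits(:), n_hits
    integer :: i
    n_hits = 0
    do i = 1, size(p1)
      if (index(strs(p1(i):p2(i)), pattern) > 0) then
        n_hits = n_hits + 1
        if (n_hits <= size(hits)) hits(n_hits) = i
      end if
    end do
  end subroutine find_containing

end module packed_search_sketch
```

Each query touches the whole buffer once, roughly 30 million characters
here. Whether that is fast enough depends on how often the search is run.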

> Inserts etc. are handled outside the Fortran and are not an issue.
> With this method, in the event I wanted to sort the elements, it
> should be easy to do it through an array of pointers that would
> contain the sort order.
>
> Thank you and thanks to all the other responders (whose suggestions
> are a bit too advanced for me - but would be a good tutorial when I
> have the time).
>
> And yes - 240 Meg of memory is an issue, I have had the program fail
> more than once for "Image size too large".
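
The sort through an array of pointers mentioned above could look roughly
like this. It is a sketch only: it assumes the same strs, p1 and p2 arrays
as Dave's code, uses an integer index array in place of actual pointers,
and the names packed_sort_sketch, sort_order, merge_sort and comes_before
are invented for the example. The packed buffer is never moved. Only
order(:) is rearranged, using a merge sort and the LLT intrinsic (ASCII
ordering, with the shorter string padded with blanks).

```fortran
module packed_sort_sketch
  implicit none
contains

  ! Fill order(:) so that element order(1) sorts first, order(2) second,
  ! and so on. order(:) is assumed to have the same size as p1(:).
  subroutine sort_order(strs, p1, p2, order)
    character(*), intent(in) :: strs
    integer, intent(in) :: p1(:), p2(:)
    integer, intent(out) :: order(:)
    integer, allocatable :: work(:)
    integer :: i

    do i = 1, size(order)
      order(i) = i
    end do
    allocate (work(size(order)))
    call merge_sort(1, size(order))

  contains

    ! Sort order(lo:hi) by the strings the indices refer to.
    recursive subroutine merge_sort(lo, hi)
      integer, intent(in) :: lo, hi
      integer :: mid, i, j, k
      if (lo >= hi) return
      mid = (lo + hi) / 2
      call merge_sort(lo, mid)
      call merge_sort(mid + 1, hi)
      work(lo:hi) = order(lo:hi)
      i = lo
      j = mid + 1
      do k = lo, hi
        if (i > mid) then
          order(k) = work(j); j = j + 1
        else if (j > hi) then
          order(k) = work(i); i = i + 1
        else if (comes_before(work(j), work(i))) then
          order(k) = work(j); j = j + 1
        else
          ! Ties take the left half first, which keeps the sort stable.
          order(k) = work(i); i = i + 1
        end if
      end do
    end subroutine merge_sort

    ! True when element a sorts strictly before element b.
    logical function comes_before(a, b)
      integer, intent(in) :: a, b
      comes_before = llt(strs(p1(a):p2(a)), strs(p1(b):p2(b)))
    end function comes_before

  end subroutine sort_order

end module packed_sort_sketch
```

After the call, strs(p1(order(k)):p2(order(k))) is the k-th element in
sorted order. The extra storage is two integer arrays of n_strings
elements, which is small next to the buffer itself.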

