Are dynamic arrays in Delphi half-baked?

In a nice little series on ‘Delphi in a Unicode world’, written and published around the time of Delphi 2009’s release, Nick Hodges writes the following on the topic of using strings as binary buffers:

A common idiom is to use a string as a data buffer. It’s common because it’s been easy — manipulating strings is generally pretty straight forward. However, existing code that does this will almost certainly need to be adjusted given the fact that string now is a UnicodeString.

There are a couple of ways to deal with code that uses a string as a data buffer. The first is to simply declare the variable being used as a data buffer as an AnsiString instead of string […] The second and preferred way dealing with this situation [, however,] is to convert your buffer from a string type to an array of bytes, or TBytes. TBytes is designed specifically for this purpose, and works as you likely were using the string type previously.

Now, I’m totally at one with those who think misusing the string type for binary buffers was a silly thing to do. Nevertheless, to say TBytes was ‘designed specifically for this purpose’ is equally as silly in my view, since in being a simple typedef for a dynamic array of bytes that was only added in D2007 (dynamic arrays themselves being added way back in D4), it patently wasn’t.

More to the point, despite having an implementation that redeployed that of the original AnsiString type for more general purposes, dynamic arrays at large — and thus, TBytes specifically — suffer from various key shortcomings in comparison:

No copy-on-write semantics. The fact that dynamic arrays and strings share key RTL functions (Copy, Length and SetLength) frequently leads me to forget this, as well as the fact that dynamic arrays aren’t in fact pure reference types in use.
The equals (=) and not equals (<>) operators compare references rather than data. (Note how the string type is simply more flexible here, since you can just cast to Pointer if you do want to compare string references.)
You can’t use the addition (+) operator. For sure, using this in a tight loop is highly inefficient, but if it’s so terrible in principle, why allow it for strings? [Edit: before you get the wrong idea, see my response to Luigi Sandon (‘LDS’) in the comments.]
You cannot assign an array constant to a dynamic array. Cf. how there isn’t a practical distinction between string constants and string variables — they’re all just ‘strings’, and even under the hood, a string constant is just a string with a dummy reference count.
No copy-on-write semantics means you lose much of the const-ness of constant parameters and read-only properties: basically, the consumers of an object can change the elements of a read-only dynamic array property where they can’t change the characters of a read-only string property. Admittedly, the loss of the const-ness of constant parameters is much alleviated by the open array syntax (though let’s not dilute this by encouraging the use of parameters declared as TBytes rather than ‘const array of Byte’, eh?).* Nonetheless, it is still an unfortunate side effect of dynamic arrays not being implemented as quasi-value types, à la AnsiString and UnicodeString.
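
A minimal sketch of the first two points, assuming a Unicode-era compiler (Delphi 2009 or later) and the stock TBytes from SysUtils. The program and variable names are purely illustrative:

```pascal
program CowDemo;
{$APPTYPE CONSOLE}
uses
  SysUtils;
var
  S1, S2: AnsiString;
  A1, A2: TBytes;
begin
  // Strings: assignment shares the data, but the first write makes a copy.
  S1 := 'abc';
  S2 := S1;
  S2[1] := 'X';
  Writeln(S1);      // abc (S1 is untouched)

  // Dynamic arrays: assignment shares the data and a write goes straight through.
  SetLength(A1, 3);
  A1[0] := 1;
  A2 := A1;
  A2[0] := 99;
  Writeln(A1[0]);   // 99 (A1 sees the change made via A2)

  // ...yet SetLength detaches A2, so arrays aren't pure reference types either.
  SetLength(A2, 3);
  A2[0] := 5;
  Writeln(A1[0]);   // still 99

  // The = operator compares references, not contents.
  A2 := Copy(A1);
  Writeln(A1 = A2); // FALSE, though the bytes are identical
  Writeln(CompareMem(Pointer(A1), Pointer(A2), Length(A1))); // TRUE
end.
```

The last two lines show the usual workaround for comparing contents: check the lengths match, then hand both buffers to CompareMem.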

In my view, it is these features that make manipulating strings ‘pretty straight forward’, and moreover, not prone to bugs through not fully understanding the type’s internal semantics. The fact that dynamic arrays do not have them, then, makes the idea of TBytes being some sort of genuine substitute for the misused old AnsiString quite false. That said, one particular issue with dynamic arrays especially gets my beef, but I’ll leave elucidating that to another time…

* Thus:

procedure Test(const Arg1: TBytes; const Arg2: array of Byte);
begin
  Arg1[0] := 99; //compiles!
  Arg2[0] := 99; //doesn't compile
end;
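
The read-only property half of the same complaint can be sketched along similar lines. THolder below is a made-up class, not anything from the RTL, and going via a local variable is just the most explicit way of showing the leak:

```pascal
program ReadOnlyLeak;
{$APPTYPE CONSOLE}
uses
  SysUtils;
type
  // THolder is a hypothetical class for illustration only.
  THolder = class
  private
    FData: TBytes;
    FName: string;
  public
    constructor Create;
    property Data: TBytes read FData;   // 'read-only' in name only
    property Name: string read FName;   // genuinely read-only
  end;

constructor THolder.Create;
begin
  inherited Create;
  SetLength(FData, 2);
  FData[0] := 1;
  FName := 'abc';
end;

var
  H: THolder;
  Leak: TBytes;
  S: string;
begin
  H := THolder.Create;
  try
    Leak := H.Data;      // copies the reference, not the bytes
    Leak[0] := 99;       // writes into the object's private buffer
    Writeln(H.Data[0]);  // 99

    S := H.Name;
    S[1] := 'X';         // copy-on-write: only S changes
    Writeln(H.Name);     // abc
  finally
    H.Free;
  end;
end.
```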

47 thoughts on “Are dynamic arrays in Delphi half-baked?”

Mason Wheeler says: 4 December, 2009 at 8.19 pm

Using a TBytes to replace strings as binary buffers? OK, I’m a bit confused here. Isn’t that what RawByteString is there for?

- CR says: 4 December, 2009 at 8.27 pm
  
  Mason — RawByteString isn’t for that at all. Rather, its purpose is to serve as the parameter type for when you want a routine to accept an AnsiString with any given code page (use just AnsiString, and there will be an implicit conversion if the input string does not have the system code page). In short, RawByteString is equivalent to the old AnsiString in one context only, namely as a parameter type. The fact that you can declare a variable (and not just a parameter) with the type of RawByteString is just a limitation of the Delphi type system.
  
  - Mason Wheeler says: 4 December, 2009 at 8.54 pm
    
    Even so, it works really well as a binary buffer where strings used to. Definitely better than TBytes does!
    
    - CR says: 4 December, 2009 at 10.30 pm
      
      You’re still thinking of it in the wrong way. A RawByteString is *not* an AnsiString that has no defined codepage. Rather, it’s an AnsiString that takes on the codepage of whatever string is assigned to it.
      
      - Arioch says: 18 February, 2013 at 10.20 am
        
        Then at least the name is intentionally misleading. The name does not say “any-charset AnsiString” but does say “raw bytes buffer”.
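
The point CR makes in this thread, that a RawByteString takes on the code page of whatever is assigned to it, can be checked with a short sketch. It assumes Delphi 2009 or later, where StringCodePage is available in the System unit; ShowCodePage is a made-up helper:

```pascal
program RawByteDemo;
{$APPTYPE CONSOLE}

// Hypothetical helper: accepts an AnsiString of any code page without conversion.
procedure ShowCodePage(const S: RawByteString);
begin
  Writeln(StringCodePage(S));
end;

var
  U: UTF8String;
  R: RawByteString;
begin
  U := 'abc';
  R := U;   // no conversion: R now carries U's code page
  Writeln(StringCodePage(R) = StringCodePage(U));  // TRUE
  ShowCodePage(U);                                 // 65001, i.e. UTF-8
end.
```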
LDS says: 4 December, 2009 at 9.24 pm

Frankly, it looks like “lazy programming” to me. When handling binary buffers, there are far better techniques than simple arrays and + operators to add an element, re-allocating memory over and over. The right one to use depends on how the buffer itself is used. Some string manipulation facilities are there to ease simple string manipulation, but an application performing heavy string processing won’t use them anyway, or the price in terms of performance would be very high. Try to use a TMemoryStream too and use Write(Buf, 1), first without setting the stream size and then with setting it, and look at the performance…
Unluckily the pre-web programmer was much more skilled in using the proper algorithms to treat the proper data. Then came the web and Javascript and everything became a string. I saw a PHP programmer performing bitwise operations using a string of ‘0’ and ‘1’ characters. And I had in my office a Java programmer attempting it – he desisted when I was about to throw a Knuth book at him…
Remember Delphi is an OO language – a binary buffer can be encapsulated into the proper class and some methods to ease data manipulation written. Use strings for what they are.

- CR says: 4 December, 2009 at 10.37 pm
  
  ‘When handling binary buffers, there are far better techniques than simple arrays and + operators to add an element re-allocating memory over and over.’
  
  And where exactly was I advocating that? (Cf. the comment in the original post about ‘tight loops’.) A more reasonable scenario would be constructing (or adding to) an array from a variety of sources in one go:
```
MyRects := MyRects + Ctrl1.BoundsRect + Ctrl2.BoundsRect + Rect(0, 0, 10, 10) + RectConst; 
```
  which would be a direct shortcut for
```
OldLen := Length(MyRects);
SetLength(MyRects, OldLen + 4);
MyRects[OldLen] := Ctrl1.BoundsRect;
MyRects[OldLen + 1] := Ctrl2.BoundsRect;
MyRects[OldLen + 2] := Rect(0, 0, 10, 10);
MyRects[OldLen + 3] := RectConst;
```
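  Short of a real + operator, an open array parameter gets reasonably close to the one-line form. A sketch only: AppendInts and TIntArray are made-up names, and Integer stands in for TRect to keep the program free of unit dependencies:
```pascal
program AppendDemo;
{$APPTYPE CONSOLE}
type
  TIntArray = array of Integer;

// Grows Arr with a single SetLength call, then copies the new items
// in after the existing ones.
procedure AppendInts(var Arr: TIntArray; const Items: array of Integer);
var
  OldLen, I: Integer;
begin
  OldLen := Length(Arr);
  SetLength(Arr, OldLen + Length(Items));
  for I := 0 to High(Items) do
    Arr[OldLen + I] := Items[I];
end;

var
  MyInts: TIntArray;
begin
  AppendInts(MyInts, [1, 2]);
  AppendInts(MyInts, [3, 4, 5, 6]);  // four items, one SetLength call
  Writeln(Length(MyInts));           // 6
  Writeln(MyInts[5]);                // 6
end.
```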
  ‘Try to use a TMemoryStream too and use Write(Buf, 1), first without setting the stream size and then with setting it, and look at the performance…’
  
  Actually, that’s less of an issue in a real-world situation, since TMemoryStream expands its capacity intelligently, or at least, not as stupidly as you imply (check out the source — of course, if you really were to input one byte at a time, manually setting Capacity to something appropriate beforehand would be a good idea).
  
  ‘Remember Delphi is an OO language – a binary buffer can be encapsulated into the proper class and some methods to ease data manipulation written.’
  
  Java-bred weenie! What’s wrong with the ‘program in Delphi as if it were C’ approach? 😉 More seriously, I’m not sure where you got the idea I would disagree with that. Cf. this unit (TExifTag is the ultimate container class, and line 189 has my choice of ultimate data type…)
  
  ‘Use strings for what they are.’
  
  Urgh, I wasn’t advocating the use of strings for binary buffers. As I wrote – ‘Now, I’m totally at one with those who think misusing the string type for binary buffers was a silly thing to do. Nevertheless, to say TBytes was ‘designed specifically for this purpose’ [of being the container type for binary buffers] is equally as silly in my view…’ My aim was to criticise Delphi’s half-arsed dynamic array implementation, not praise misbegotten ‘alternatives’.
  
  - LDS says: 5 December, 2009 at 1.44 pm
    
    Did you study linear algebra? Your meaning of the operator “+” is not the only one possible. You mean concatenation only, but with vectors and matrices it usually has a very different meaning. How would concatenation work with multidimensional arrays? Which dimensions should it grow? Strings at least are unidimensional and have a simpler meaning for “+”. Even vectors and matrices may not be the right choice for arbitrary binary buffers. Buffers are “chunks of memory”, and should be defined as such.
    My example about TMemoryStream just shows how copy on write and even expanding a buffer over and over can be a performance killer. Easy to use, but beware of the implications. Many programmers seem “lazy” and prefer the simpler approach to avoid proper buffer management.
    And if you’re going to use the “C” approach, you would use GetMem/malloc, casts, and pointer arithmetic, not arrays. And to hide this complexity a good class is IMHO the way to go – and I started programming much before OO and Java appeared – but I see no reason why I should not take advantage of a cleaner OO approach.
    
    - CR says: 5 December, 2009 at 2.19 pm
      
      ‘Your meaning of the operator “+” is not the only one possible.’
      
      And where did I say it was?
      
      ‘You mean concatenation only’
      
      Given the context was functionality on the string type missing on dynamic arrays (and in particular, dynamic arrays one might use as a substitute for strings misused as binary buffers), wasn’t that obvious? Oddly enough, it seems it *was* obvious before your initial reply to me, otherwise your comment about adding data to a stream one byte at a time wouldn’t have made much sense.
      
      ‘but with vectors and matrices it usually has a very different meaning. How would concatenation work with multidimensional arrays?’
      
      Well, as you bring it up: by default it shouldn’t be supported, but if we’re now talking about hypothetical development of the language, why not allow operator overloading on array types a la that which was given to record types in D2005? For sure, to be efficient, the current syntax for records would need to be expanded a bit (IIRC, you can only handle adding two instances at a time, which means the custom handler is called multiple times for code such as Inst1+Inst2+Inst3+…).
      
      ‘And if you’re going to use the “C” approach, you would use GetMem/malloc, casts, and pointer arithmetic, not arrays.’
      
      WTF – where on earth did I imply otherwise? You also appear to have missed the smiley. That said, the distinction between typed pointers and arrays is obviously moot in C anyway…
      
      ‘And to hide this complexity a good class is IMHO the way to go – and I started programming much before OO and Java appeared – but I see no reason why I should not take advantage of a cleaner OO approach.’
      
      Right, which is why I linked to that unit of mine, a unit that, while being somewhat convoluted and having its own purpose (obviously), is an example of what you are saying — data is ultimately held via bare pointers, and is exposed via a series of methods and properties.
      
      - LDS says: 5 December, 2009 at 10.16 pm
        
        > And where did I say it was?
        > wasn’t that obvious?
        🙂
        > it seems it *was* obvious before your initial reply to me
        Yes – but I took the time to look at it from a broader perspective. You looked at arrays as binary data buffers only, but they are not that. They are mathematical objects with their own rules.
        > otherwise your comment about adding data to a stream one byte at a time
        That was an example to show how “copy-on-write” semantics may not be the right choice. While strings are usually relatively small, arrays may be tens or hundreds of MB. Are you sure copy-on-write is the best way to handle them?
        > why not allow operator overloading on array types
        It would be like allowing operator overloading for integer or float types. Feasible, but it could introduce very subtle behaviours. Probably, it would be better to introduce a concatenation operator.
        > You also appear to have missed the smiley. That said, the distinction
        Sorry, I misunderstood you. Yes, in C you can “index” a pointer and access the n-th element. That’s something I would like to see in Delphi too. That way it would be easier to use GetMem to allocate a buffer (as long as Hodges doesn’t remove it) and access data without using casts. But remember that buffers, strings and arrays are different kinds of data structure, although accessed in a similar way.
      - CR says: 6 December, 2009 at 1.23 pm
        
        LDS —
        
        ‘It [i.e., operator overloading on array types] would be like allowing operator overloading for integers or float types’
        
        Not really IMO, given arrays don’t support the operator at all at the present, just like records didn’t before D2005. As you say, when talking about multidimensional arrays especially, there’s no obvious default meaning for the plus operator, which is quite different from the case of integer and float types. In being undefined at the present, then, having the operator work will give rise to the expectation that there must be custom behaviour in effect, an expectation that wouldn’t arise in the integer or float case.
        
        Yes, in C you can “index” a pointer and access the n-th element. That’s something I would like to see in Delphi too.
        
        Given this was added to all typed pointer types in D2009, what’s the benefit of adding it to Pointer too? Allen Bauer’s explanation here seems conceptually sound to me (‘Variables of type “Pointer” do not allow the pointer math features since it is effectively pointing to a “void” element which is 0 size’). Personally, the hand-holding involved in having a compiler directive rather than just enabling it by default for all typed pointers is the (slightly) larger issue, though it’s good the Delphi team finally saw the benefit of allowing it at all.
        
        That way it would be easier to use GetMem to allocate a buffer […] and access data without using casts.
        
        I don’t quite follow. For me, the main source of potential bugs in this area is GetMem and (in particular) ReallocMem taking the byte size as their length parameter — basically, it would be nice to have ‘GetTypedMem’ and ‘ReallocTypedMem’ standard procedures that would understand typed pointers a la New, Dispose, Inc and Dec.
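
For reference, a small sketch of what the D2009 directive discussed in this exchange allows. It assumes Delphi 2009 or later; note that GetMem still wants a byte count rather than an element count, which is the pitfall CR mentions:

```pascal
program PointerMathDemo;
{$APPTYPE CONSOLE}
{$POINTERMATH ON}   // enables indexing and arithmetic on typed pointers
var
  P: PInteger;
  I: Integer;
begin
  GetMem(P, 4 * SizeOf(Integer));   // size in bytes, not in elements
  try
    for I := 0 to 3 do
      P[I] := I * 10;               // index the typed pointer like an array
    Writeln((P + 2)^);              // 20: the arithmetic is scaled by SizeOf(Integer)
  finally
    FreeMem(P);
  end;
end.
```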
Serg says: 4 December, 2009 at 11.18 pm

The implementation of dynamic arrays must be improved. I miss copy-on-write arrays (though I also need dynamic arrays without copy-on-write semantics, both kinds must be available). Other array shortcomings listed in post are less important for me but I would like to see them improved also. And that is more important than just using dynamic arrays as binary buffers.

- CR says: 5 December, 2009 at 12.30 am
  
  ‘The implementation of dynamic arrays must be improved.’
  
  Unless the language is itself ‘reset’ somewhat, I’d assume this won’t and (alas) shouldn’t happen for backwards compatibility reasons.
  
  ‘I miss copy-on-write arrays (though I also need dynamic arrays without copy-on-write semantics, both kinds must be available).’
  
  Why both? It would only complicate the language unnecessarily (cf. the all too numerous string types). If we could go back in time, I’d have dynamic arrays given copy-on-write semantics only, since if you want pure reference type behaviour, you can use pointers. Don’t forget that copy-on-write also has the benefit of making const parameters and read only properties what they should be: constant and read only.
  
  - Serg says: 6 December, 2009 at 4.41 pm
    
    “If we could go back in time, I’d have dynamic arrays given copy-on-write semantics only, since if you want pure reference type behaviour, you can use pointers.”
    If we could go back in time, first of all I’d have changed the assignment semantics, i.e.
    var
    A, B: array of Integer;
    ..
    A:= [1, 3, 5];
    B:= A; // here the array data itself is assigned, not reference to A
    
    Copy-on-write reference assignment for dynamic arrays is good compared with the current implementation (reference assignment without copy-on-write), but often all you need is the data assignment without reference-counting overhead, just like the ordinary array assignments.
    
    - CR says: 6 December, 2009 at 11.18 pm
      
      If we could go back in time, first of all I’d have changed the assignment semantics, i.e.
      var
      A, B: array of Integer;
      ..
      A:= [1, 3, 5];
      B:= A; // here the array data itself is assigned, not reference to A
      
      Why *less* than copy on write, given the latter had already been implemented for AnsiString, and dynamic arrays were reusing part of the AnsiString implementation anyway?
      
      Copy-on-write reference assignment for dynamic arrays is good compared with the current implementation (reference assignment without copy-on-write), but often all you need is the data assignment without reference-counting overhead, just like the ordinary array assignments.
      
      You can get that now — just call the version of Copy that takes a single parameter:
      
      B := Copy(A);
      
      - Serg says: 7 December, 2009 at 3.41 am
        
        “You can get that now — just call the version of Copy that takes a single parameter:
        
        B := Copy(A);”
        
        You can’t do it for record assignments, eg
        
        type
        TMyInt = record
        Used: Integer;
        Data: array of Longword;
        …
        end;
        
        var A,B: TMyInt;
        
        A:= 1;
        B:= A;
        
        You can write your own assignment method such as B:= A.Copy, but if you are implementing integer arithmetic, would you like to explain that writing
        
        B:= A;
        
        is an error?
      - CR says: 7 December, 2009 at 8.59 pm
        
        You can’t do it for record assignments
        
        I thought we were talking about dynamic arrays…?
      - Serg says: 8 December, 2009 at 7.12 am
        
        “I thought we were talking about dynamic arrays…?”
        
        Yes I am talking about dynamic array as a record field. If you assign such a record only reference to dynamic array is assigned, and without copy-on-write that is VERY bad.
        
        type
        TMyInt = record
        Used: Integer;
        Data: array of Longword;
        end;
        ..
        var
        A, B: TMyInt;
        ..
        
        A:= B; // here the reference to B.Data is assigned to A.Data, and without copy-on-write semantics. If you change later A.Data, B.Data is also changed since both reference the same data.
        
        Normally declaring a dynamic array as record field you expect the array data assignment instead of reference assignment.
        
        That is why I say that
        1) reference assignment without copy-on-write is the worst
        2) reference assignment with copy-on-write is better
        3) array data assignment is the best
        
        I think that the best dynamic array implementation in the delphi compiler must resemble ordinary static array as much as possible.
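
The aliasing Serg describes, along with the explicit-copy workaround, can be sketched as follows. The Clone method and the TLongWordArray alias are made-up names, and records with methods need Delphi 2006 or later:

```pascal
program RecordAliasDemo;
{$APPTYPE CONSOLE}
type
  TLongWordArray = array of LongWord;

  TMyInt = record
    Used: Integer;
    Data: TLongWordArray;
    function Clone: TMyInt;   // hypothetical deep-copy helper
  end;

function TMyInt.Clone: TMyInt;
begin
  Result.Used := Used;
  Result.Data := Copy(Data);  // Copy allocates a fresh array
end;

var
  A, B, C: TMyInt;
begin
  SetLength(B.Data, 2);
  B.Data[0] := 1;
  B.Used := 1;

  A := B;               // copies Used, but only the reference to Data
  C := B.Clone;         // copies the array contents too

  A.Data[0] := 42;
  Writeln(B.Data[0]);   // 42: A and B share one array
  Writeln(C.Data[0]);   // 1: the clone is independent
end.
```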
      - CR says: 12 December, 2009 at 1.25 pm
        
        ‘If you assign such a record only reference to dynamic array is assigned, and without copy-on-write that is VERY bad.’
        
        Why’s it ‘very’ bad if one realises a dynamic array is essentially (if not completely!) a reference type? Cf. the case of a record containing an object field: you wouldn’t want the compiler to try and figure out how to make a value assignment of the object data.
        
        ‘I think that the best dynamic array implementation in the delphi compiler must resemble ordinary static array as much as possible.’
        
        I don’t disagree with that, though the type incompatibility between the two outside of open array parameters is a bigger annoyance for me.
        
        (By the by, apologies for the crappy formatting of nested comments. As I’m just using hosted WordPress, I can’t change it, unfortunately, without changing the whole blog theme.)
Xepol says: 5 December, 2009 at 1.07 am

First off, before unicode, using a string as a databuffer was not silly – it was a natural fit with zero downsides. You got automatic data size handling, automatic garbage collection, a pile of functions for working with data subsets and direct indexed access to the contents. Plus the copy on write semantics meant that multiple references to the data buffer did not waste a lot of ram.

TBytes is a poor choice BECAUSE it is only available from D2007 and on – if you have to support something older, you either can’t use it, or you have to reinvent the wheel for older versions of the compiler.

As for using AnsiStrings – this is a suicidal choice for a data buffer now because of the codepage magic that goes on in the background. Data buffers should not be able to magically change their contents. Think it won’t happen? Think again. Most of the string handling routines – even those that claim to be for AnsiString types, are actually for unicode strings – which *WILL* lead to code page changes sooner or later and likely only in non standard locations making the problem hard to track down.

The ONLY trustworthy option is RawByteString – code page adjustments are disallowed, it definitely can not be treated as UTF encoded data and it is easy to back port this to old compilers with a version-sensitive IFDEF retyping RawByteString as String.

I find it interesting that even now, 2 compiler versions after unicode strings were introduced, we are still discussing this. Why are we discussing this? Because unicode strings and by extension AnsiStrings are inherently UNTRUSTWORTHY datatypes. Your data can change on you without that being your intention, and the “seamless upgrade” caused more problems than it ever hoped to fix.

While Unicode might be a good idea, the way it was implemented in Delphi was definitely not. As a result, these types of issues will keep coming up over and over again.

- runner says: 5 December, 2009 at 9.35 am
  
  Using AnsiString as a binary buffer was not ok in my opinion. You must be aware of the things that can happen in the future if you misuse the technology. AnsiString was not designed for that. Yes it was convenient. Very. But that is no excuse for misusing it. I wrote about converting cryptographic algorithms to unicode and made it clear how it should be done:
  
  http://www.cromis.net/blog/2009/11/how-to-correctly-handle-cryptography-in-ansi-and-unicode-flavor/
  
  And I suspect we will have similar discussion about class helpers and 64 bit pointers sometime in the future. I had a similar incident recently and I blame my lazy ass solely for that.
  
  What happened? I had a hash table that accepted objects. But I wanted to store some interfaces in it. So instead of making a wrapper object for the interface I did this:
  
  FHashTable.Item[‘Key’] := TObject(Pointer(MyInterface));
  
  And vice versa on the other end. It worked well and the interface would never get released in the meantime (yes I know the reference count will not increment this way); I made sure there was at least one reference somewhere else. But what happened was that I started getting strange behaviour under the Delphi 2010 compiler. Probably because they played with interface to object blind casting.
  
  But still the blame is on me. It is not a proper way to do what I wanted.
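
  The wrapper object runner says he should have written might look something like this sketch. TInterfaceBox is a made-up name, and a TStringList stands in for the hash table, whose API isn't shown in the comment:

```pascal
program IntfBoxDemo;
{$APPTYPE CONSOLE}
uses
  Classes;
type
  // Hypothetical wrapper: holds a proper counted reference to the interface,
  // so no blind casting between interface and object is needed.
  TInterfaceBox = class
  private
    FIntf: IInterface;
  public
    constructor Create(const AIntf: IInterface);
    property Intf: IInterface read FIntf;
  end;

constructor TInterfaceBox.Create(const AIntf: IInterface);
begin
  inherited Create;
  FIntf := AIntf;
end;

var
  Table: TStringList;   // stand-in for the object-only hash table
  MyInterface: IInterface;
begin
  MyInterface := TInterfacedObject.Create;
  Table := TStringList.Create;
  try
    Table.AddObject('Key', TInterfaceBox.Create(MyInterface));
    // Getting the interface back is an ordinary object cast and property read.
    Writeln(TInterfaceBox(Table.Objects[0]).Intf = MyInterface);  // TRUE
    Table.Objects[0].Free;   // the box releases its reference when destroyed
  finally
    Table.Free;
  end;
end.
```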
  
  - CR says: 5 December, 2009 at 11.20 am
    
    ‘Using AnsiString as binary buffer was not ok in my opinion.’
    
    Right, but TBytes isn’t a very good alternative, which was the point I was making. (I see you don’t appear to think that yourself however, given the code in that post you linked to.) The best way IMO remains what it was in the earliest versions of Delphi, which is to use a manually-managed buffer internally (ReallocMem is your friend…) and expose high-level properties and methods rather than the buffer directly.
    
    - runner says: 5 December, 2009 at 11.36 am
      
      I agree that TBytes is not as elegant, sure. And I agree that some higher level code should be introduced. The code I have written about could use some higher level manipulation. But Move was more than enough in that case. I love simplicity and elegance and sometimes working with arrays is a pain.
      
      But I had a different point there. Cryptographic data should only be treated as binary data and strings should be converted with helper functions at a higher level. I suppose there is some analogy to what you are saying. There are too many pitfalls if you are messing with strings directly in cryptography. I have made some of those mistakes myself. As I said, if I am shown a better and safer approach I will gladly use it. But as for now I see my approach as the safest one (not the easiest).
      
      - CR says: 5 December, 2009 at 11.52 am
        
        ‘I agree that TBytes is not as elegant, sure.’
        
        In an ideal world, you could use a TAnsiCharDynArray type (defined as just a standard dynamic array of AnsiChar), which would do everything as the old AnsiString but for conversions from PAnsiChar values.
        
        ‘But I had a different point there. Cryptographic data should only be treated as binary data and strings should be converted with helper functions at a higher level.’
        
        Sure.
  - Arioch says: 18 February, 2013 at 10.42 am
    
    Now waiting for PChar/PAnsiChar/PWideChar to catch up with strings and, uniformly with them, start doing “under the hood” charset conversions.
    
    And then calling C_DLL_API(PAnsiChar(UTF8Encode(bla-vla))); would suddenly start to mess things up.
    
    Like a particular case of anticipating future changes
    
    - Chris Rolliston says: 18 February, 2013 at 1.11 pm
      
      ‘Now waiting for PChar/PAnsiChar/PWideChar to catch up with strings and, uniformly with them, start doing “under the hood” charset conversions.’
      
      Sorry, I’ve no idea what you’re talking about. PChar/PAnsiChar/PWideChar are typed pointer types, and as such, are codepage agnostic.
      
      - Arioch says: 18 February, 2013 at 1.26 pm
        
        But UnicodeStringVar := PAnsiCharVar; statement has a very special implementation, quite different from true typed pointer like UnicodeStringVar := PIntegerVar;
      - Chris Rolliston says: 18 February, 2013 at 1.34 pm
        
        I wouldn’t call it ‘very special’ myself – that and SetString do the wrong thing, though this has little to do with D2009. The behaviour goes way back, and assumes there can be only one active ‘Ansi’ codepage – the PAnsiChar itself holds no codepage information – how could it? In contrast, conversions between AnsiString and UnicodeString work with the codepage data held by the AnsiString instance in question.
      - Arioch says: 18 February, 2013 at 7.49 pm
        
        “assumes there can be only one active ‘Ansi’ codepage – the PAnsiChar itself holds no codepage information – how could it?”
        
        Exactly the words that were told about AnsiString back before 2009 by everyone who used them as buffer. Really, how else it could possibly be ? Then… then one day it just changed.
        
        They may introduce PUTF8Char/POEMChar/whatever. As a very simplistic example.
- CR says: 5 December, 2009 at 11.35 am
  
  ‘First off, before unicode, using a string as a databuffer was not silly’
  
  Yes it was. A string value should hold a string.
  
  ‘You got automatic data size handling, automatic garbage collection, a pile of functions for working with data subsets and direct indexed access to the contents. Plus the copy on write semantics meant that multiple references to the data buffer did not waste a lot of ram.’
  
  Which is where my criticism of the half-hearted implementation of dynamic arrays comes in. You seem to be conflating two issues here: the power of the Delphi string type since D2, and the lameness of dynamic arrays.
  
  ‘TBytes is a poor choice BECAUSE it is only available from D2007 and on’
  
  Crap – TBytes is merely a typedef for ‘array of Byte’. The real reason TBytes is a poor choice is because the implementation of dynamic arrays is so half-arsed. IMO, it would have been better for TBytes to have been a new type in its own right, giving you copy on write etc., so that it could be a genuine substitute for the misused string type.
  
  ‘The ONLY trustworthy option is RawByteString’
  
  Er, no. See my reply to Mason earlier. RawByteString is *only* equivalent to the old AnsiString as a parameter type. The exact same code page conversion issues you mentioned for the new AnsiString can apply to RawByteString too if you’re not ultra-careful.
  
  ‘I find it interesting that even now, 2 compiler versions after unicode strings were introduced, we are still discussing this. Why are we discussing this?’
  
  Look, my post wasn’t really about Unicode or D2009 *at all*. Rather, it was about the dynamic array implementation introduced in *D4*, over a decade ago! As for why I’m still discussing *that*, well the main reason is that I thought it was time to update this blog with a post other than one which says ‘here’s an update to my Exif parsing code’…
  
  ‘While Unicode might be a good idea, the way it was implemented in Delphi was definitely not.’
  
  I totally disagree with the second part of that sentence.
  
  - runner says: 5 December, 2009 at 11.49 am
    
    @CR
    
    Now that I read this response I think we are in the same boat. I would love to see arrays more powerful. I like the strings and arrays in .NET and how they are implemented as classes (yes there is a big difference in implementation)
    
    But, what we have now is what we must use. 🙂
    
    - CR says: 5 December, 2009 at 11.58 am
      
      ‘Now that I read this response I think we are in the same boat.’
      
      Ah, good. My original post was probably a bit confusing given the way I set it up.
      
    - Dimitrij says: 6 December, 2009 at 6.45 pm
      
      They are implemented as classes and thus extremely slow.
      I love the arrays like they are now. Please don’t make Delphi another .NET implementation.
      
      And the pre-.NET 2.0 treatment of simple types as objects was ridiculous!
      
      - CR says: 6 December, 2009 at 11.20 pm
        
        Huh? What’s either .Net or making arrays classes got to do with any of the discussion here…?
      - Dimitrij says: 7 December, 2009 at 7.25 am
        
        I’ve replied to the runner’s post.
        
        [Apologies – CR]
      - runner says: 7 December, 2009 at 8.59 am
        
        OK, in .NET arrays and strings are slower for other reasons too, not only because they are classes. Strings being immutable is one of those reasons.
        
        I am not advocating classes per se; it could be records or anything else. I would also be happy with simple RTL functions that operate on arrays. But as it is now we have almost nothing as far as arrays are concerned. Yes, I can write it myself, but I have very little time and there are a lot of things to be done.
        
        And no I don’t want another .NET clone, otherwise I would not be coding in Delphi anymore. I love its raw power.
Jolyon Smith says: 6 December, 2009 at 7.05 pm

Seems to me there are a lot of surprising gaps in people’s knowledge, from people who are otherwise very knowledgeable.

CF >> Yes, in C you can “index” a pointer and access the n-th element.
CF >> That’s something I would like to see in Delphi too.

LDS > Given this was added to all typed pointer types in D2009

AFAIK you have *always* been able to index a typed pointer in Delphi. What was added in D2009 was the ability to enable pointer ARITHMETIC on all typed pointers.

And for everyone advocating encapsulating buffers inside OO… that’s all fine and dandy, but the subject of this post is really about how that encapsulation is then able to work most effectively/intuitively/efficiently with its INTERNAL REPRESENTATION of whatever data storage it encapsulates and exposes.

And at some point someone will say… “and I need a reference to/a copy of the contents of the buffer in the encapsulation to pass to this that or the other external API which expects a PChar/Pointer/String/etc”.

The problem with discussing points in general is that it’s all too easy to neglect the influence that the aspects of the particular always exert over any “ideal” approach.

- CR says: 6 December, 2009 at 11.44 pm
  
  ‘AFAIK you have *always* been able to index a typed pointer in Delphi. What was added in D2009 was the ability to enable pointer ARITHMETIC on all typed pointers.’
  
  You are wrong – it was only allowed for PAnsiChar/PWideChar/PChar, hence PByteArray etc. in SysUtils. In Delphi-land, pointer indexing and arithmetic have come together. Try this code in D2007 or earlier if you don’t believe me:
```
procedure Test;
var
  PA: PAnsiChar;
  PW: PWideChar;
  PC: PChar;
  PB: PByte;
begin
  PA[1] := 'T';
  PW[1] := 'e';
  PC[1] := 's';
  PB[1] := Ord('t'); //compiler error (E2016): 'array type required'
end;
```
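  For comparison, on D2009 or later the indexing that fails above can be switched on for any typed pointer with the `{$POINTERMATH ON}` directive. The following is only a minimal sketch; `PMyInt` is a hypothetical type name used for illustration:

```pascal
program PointerMathDemo;

{$APPTYPE CONSOLE}
{$POINTERMATH ON} // D2009+ only: enables indexing and arithmetic on typed pointers

type
  PMyInt = ^Integer; // hypothetical user-declared pointer type

var
  Data: array[0..2] of Integer;
  P: PMyInt;
begin
  Data[0] := 1;
  Data[1] := 2;
  Data[2] := 3;

  P := @Data[0];
  WriteLn(P[1]);      // indexing: reads Data[1]
  WriteLn((P + 2)^);  // arithmetic: steps by SizeOf(Integer), reads Data[2]
end.
```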
  ‘And at some point someone will say… “and I need a reference to/a copy of the contents of the buffer in the encapsulation to pass to this that or the other external API which expects a PChar/Pointer/String/etc”.’
  
  Well the format won’t be too complicated then if the external API is expecting a PChar, Pointer or string (!). That said, surely any ‘external API’ that wants to work with the internal data structures of the calling application will always be a pig to work with, regardless of the particular way those structures are implemented.
  
  - Jolyon Smith says: 7 December, 2009 at 7.08 pm
    
    Ah yes – 2nd axiom of Skitt’s Law rears its ugly head again. 🙂
    
    I was thinking of the clearly HUGE AND INSURMOUNTABLY DIFFICULT EFFORT of declaring a mock array type and a pointer to that array and deriving pointer indexing in that way. A technique that for me seems so natural and trivial in effort that it might as well be considered part of the language.
    
    For some people, if the language doesn’t support it directly with zero effort on their part, then I guess they’d much rather just throw their hands up in disgust and refuse on principle to type in the handful of characters required to EXTEND their toolkit in the way that suits them.
    
    One thing I am certain of however is that Inc/Dec have long since worked with typed pointers in terms of the size of the value pointed to by them, which is often directly substitutable in many instances where pointer indexing is itself useful (e.g in loops iterating over some data via pointer).
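    For readers who have not met the two techniques mentioned here, a minimal sketch follows. The type names `TDummyIntArray` and `PDummyIntArray` are hypothetical; `PByteArray` in SysUtils follows the same pattern.

```pascal
program DummyArrayDemo;

{$APPTYPE CONSOLE}

type
  // Hypothetical names. The array type is never instantiated; it exists only
  // so that a pointer to it can be indexed.
  TDummyIntArray = array[0..$0FFFFFFF] of Integer;
  PDummyIntArray = ^TDummyIntArray;

var
  Data: array[0..3] of Integer;
  Indexed: PDummyIntArray;
  Walker: PInteger;
  I: Integer;
begin
  for I := 0 to 3 do
    Data[I] := I * 10;

  // Indexing through the pointer-to-dummy-array type.
  Indexed := @Data;
  WriteLn(Indexed^[2]);

  // The same element reached with Inc, which steps by SizeOf(Integer).
  Walker := @Data[0];
  Inc(Walker, 2);
  WriteLn(Walker^);
end.
```

    Both WriteLn calls read the same element, Data[2]. Note that the dummy array's declared bounds say nothing about the real buffer size, so range checking offers no protection here.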
    
    - CR says: 7 December, 2009 at 9.35 pm
      
      Right, so you agree you were wrong then! 😉
      
      ‘I was thinking of the clearly HUGE AND INSURMOUNTABLY DIFFICULT EFFORT of declaring a mock array type and a pointer to that array and deriving pointer indexing in that way. A technique that for me seems so natural and trivial in effort that it might as well be considered part of the language.’
      
      And, ODDLY ENOUGH, I AM AWARE of the dummy array type technique (cf. this post, admittedly not on the front page yet even though I wrote it before the present one.)
      
      ‘One thing I am certain of however is that Inc/Dec have long since worked with typed pointers’
      
      Correct, hence it was a bit silly not to enable indexing on them too.
      
- runner says: 7 December, 2009 at 8.25 am
  
  But isn’t that something that TMemoryStream does now? It encapsulates a block of memory at a higher OO level, but you can still get a pointer to the raw memory when needed.
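  A minimal sketch of that idea, assuming nothing beyond the standard `Classes` unit: the stream owns the buffer, and its `Memory` property hands back a raw pointer for APIs that need one.

```pascal
program StreamBufferDemo;

{$APPTYPE CONSOLE}

uses
  Classes;

var
  Stream: TMemoryStream;
  Value: Integer;
  Raw: PByte;
begin
  Stream := TMemoryStream.Create;
  try
    Value := $01020304;
    Stream.WriteBuffer(Value, SizeOf(Value));

    // Memory exposes the stream's internal block as an untyped pointer.
    // The pointer may become invalid after any write that grows the stream.
    Raw := Stream.Memory;
    WriteLn(Stream.Size); // number of bytes written so far
    WriteLn(Raw^);        // first byte of the buffer
  finally
    Stream.Free;
  end;
end.
```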
  
Arioch says: 18 February, 2013 at 10.43 am

What did you mean by “dynamic arrays aren’t in fact pure reference types in use”?

Arioch says: 18 February, 2013 at 10.51 am

Funny thing, in this article you miss
1) one big misfeature of dynamic arrays: the unneeded, counter-intuitive assignment-incompatibility of two dynarrays. That is actually the reason Delphi 7 introduced some subset of TxxxDynArray.

2) Generic arrays since D2009.

Funny thing is that you have a TCharArray typedef in D2009+, but it is mapped to the new generic array rather than the old dynarray.

3) unneeded and counterintuitive differentiation between dynarrays and generic arrays. http://stackoverflow.com/questions/11029353

Fix that, and I think all or most of your points could be solved as implementations behind TArray or TList.
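The assignment-incompatibility in point 1 can be sketched as follows. `TIntArray` is a hypothetical named type standing in for the TxxxDynArray declarations mentioned above.

```pascal
program DynArrayCompat;

{$APPTYPE CONSOLE}

type
  TIntArray = array of Integer; // hypothetical named dynamic array type

var
  A: array of Integer; // anonymous type #1
  B: array of Integer; // anonymous type #2, distinct from A's type
  C, D: TIntArray;     // both share the one named type
begin
  SetLength(A, 3);
  SetLength(B, 3);
  // B := A; // rejected by the compiler as incompatible types, even though
  //         // both variables are declared as "array of Integer"

  SetLength(C, 3);
  D := C;   // accepted: C and D have the same named type
  WriteLn(Length(D));
end.
```

Declaring one named type and using it everywhere is the usual workaround, which is what a shared TxxxDynArray or generic array declaration provides.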

- Chris Rolliston says: 18 February, 2013 at 11.44 am
  
  Er, the blog post you refer to was written in 2009.
  
  - Arioch says: 18 February, 2013 at 11.59 am
    
    Yes, but a blog post is not an egg; it does not go rotten just because of time.
    