extended operator classes vs. type interfaces [PgSql]

From: Yeb Havinga on 9 Apr 2010 10:53

Robert Haas wrote:
> On Fri, Apr 9, 2010 at 10:33 AM, Robert Haas <robertmhaas(a)gmail.com> wrote:
>
>> On Fri, Apr 9, 2010 at 7:55 AM, Yeb Havinga <yebhavinga(a)gmail.com> wrote:
>>
>>> Robert Haas wrote:
>>>
>>>> Under the first type [4pm,5pm) =
>>>> [4pm,4:59:59pm], while under the second [4pm,5pm) = [4pm,4:59pm].
>>>>
>>>> Thoughts?
>>>>
>>>>
>>> The examples with units look a lot like the IVL<PQ> datatype from HL7, see
>>> http://www.hl7.org/v3ballot/html/infrastructure/datatypes_r2/datatypes_r2.htm
>>>
>>> About a type interface, the HL7 spec talks about promotion from e.g. a
>>> timestamp to an interval (HL7-speak for range) of timestamps, and
>>> demotion for the reverse direction. Every 'quantity type', which is any type
>>> with a (possibly partially) linearly ordered domain, can be promoted to an
>>> interval of that type. In PostgreSQL terms, this could perhaps mean that by
>>> 'tagging' a datatype as a linear order, it could automatically have a range
>>> type defined on it, as is currently done for the array types.
>>>
>> The way we've handled array types is, quite frankly, horrible. It's
>> bad enough that we now have two catalog entries in pg_type for each
>> base type; what's even worse is that if we actually wanted to enforce
>> things like the number of array dimensions we'd need even more - say,
>> seven per base type, one for the base type itself, one for a
>> one-dimensional array, one for a two-dimensional array, one for a
>> three-dimensional array. And then if we want to support range types
>> that's another one for every base type, maybe more if there's more
>> than one kind of range over a base type. It's just not feasible to
>> handle derived types in a way that requires a new instance of each base
>> type to be created for each kind of derived type. It scales as
>> O(number of base types * number of kinds of derived type), and that
>> rapidly gets completely out of hand.
>>
>
> ...which by the way, doesn't mean that your idea is bad (although it
> might not be what I would choose to do), just that I don't think our
> current infrastructure can support it.
>
Well yeah, the idea was to 'automagically' have a range type available
if the underlying type supports it, i.e. has a linear order and
therefore <, >, = etc. defined on it, "just like the array types", from a
user / datatype developer perspective.

From the implementer's perspective, IMHO an extra catalog entry in
pg_type is not bad on its own; you would have one anyway if the range
type was explicitly programmed. About different kinds of range types - I
would not know how to 'promote' integer into anything else but just one
kind of 'range of integer' type. So the number of extra pg_types would
be more like O(number of linearly ordered base types).
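
As a rough illustration of the promotion/demotion idea, here is a minimal Python sketch. The names Range, promote and demote are hypothetical, and the rule that only a single-point range can be demoted is an assumption made for the sketch, not something taken from the HL7 spec or from PostgreSQL.

```python
from dataclasses import dataclass
from typing import Generic, TypeVar

T = TypeVar("T")


@dataclass(frozen=True)
class Range(Generic[T]):
    """Closed range [low, high] over any type that defines <, <= and ==."""
    low: T
    high: T

    def __post_init__(self):
        if self.high < self.low:
            raise ValueError("low must not exceed high")

    def contains(self, value):
        return self.low <= value <= self.high


def promote(value):
    """Promote a single value to the degenerate range [value, value]."""
    return Range(value, value)


def demote(r):
    """Demote a range back to a value (assumed valid only for one point)."""
    if r.low != r.high:
        raise ValueError("only a single-point range can be demoted")
    return r.low
```

Any base type with a usable ordering (integers, timestamps, strings) works here without a per-type range class, which is the 'automagic' part of the idea.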

regards,
Yeb Havinga

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

From: Yeb Havinga on 9 Apr 2010 11:07

> From the implementer's perspective, IMHO an extra catalog entry in
> pg_type is not bad on its own, you would have one anyway if the range
> type was explicitly programmed. About different kinds of range types -
> I would not know how to 'promote' integer into anything else but just
> one kind of 'range of integer' type. So the number of extra pg_types
> would be more like O(number of linear ordered base types).
... I now see the example of different ranges in your original mail with
different unit increments. Making that more general so there could be
continuous and discrete ranges and for the latter, what would the
increment be.. OTOH is a range of integers with increment x a different
type from range of integers with increment y, if x<>y? Maybe the
increment step and continuous/discrete could be typmods.

regards
Yeb Havinga

From: Robert Haas on 9 Apr 2010 11:14

On Fri, Apr 9, 2010 at 11:07 AM, Yeb Havinga <yebhavinga(a)gmail.com> wrote:
>
>> From the implementer's perspective, IMHO an extra catalog entry in pg_type
>> is not bad on its own, you would have one anyway if the range type was
>> explicitly programmed. About different kinds of range types - I would not
>> know how to 'promote' integer into anything else but just one kind of 'range
>> of integer' type. So the number of extra pg_types would be more like
>> O(number of linear ordered base types).
>
> .. I now see the example of different ranges in your original mail with
> different unit increments. Making that more general so there could be
> continuous and discrete ranges and for the latter, what would the increment
> be.. OTOH is a range of integers with increment x a different type from
> range of integers with increment y, if x<>y? Maybe the increment step and
> continuous/discrete could be typmods.

Nope, not enough bits available there. This is fundamentally why the
typid/typmod system is so broken - representing a type as a fixed size
object is extremely limiting. A fixed size object that MUST consist
of a 32-bit unsigned OID and a 32-bit signed integer is even more
limiting. Fortunately, we don't need to solve that problem in order
to implement range types: we can just have people explicitly create
the ones they need. This will, for example, avoid creating ranges
over every composite type that springs into existence because a table
is created, even though in most cases a fairly well-defined range type
could be constructed.
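
The space problem can be sketched in a few lines of Python. The bit layout below is purely hypothetical (it is not how PostgreSQL encodes any typmod); it only assumes that the typmod is a signed 32-bit integer and that negative values are avoided because -1 conventionally means "no typmod".

```python
INT32_MAX = 2**31 - 1
INCREMENT_BITS = 30  # hypothetical layout: bit 30 = discrete flag


def pack_typmod(discrete, increment):
    """Pack a discrete flag and an integer increment into one int32."""
    if not 0 <= increment < 2**INCREMENT_BITS:
        raise OverflowError("increment does not fit in 30 bits")
    value = (int(discrete) << INCREMENT_BITS) | increment
    assert 0 <= value <= INT32_MAX
    return value


def unpack_typmod(typmod):
    """Inverse of pack_typmod: returns (discrete, increment)."""
    discrete = bool((typmod >> INCREMENT_BITS) & 1)
    increment = typmod & (2**INCREMENT_BITS - 1)
    return discrete, increment
```

A step of 60 round-trips fine, but a step of one day expressed in microseconds (86,400,000,000) already overflows, and a non-integer step such as an interval has nowhere to go at all.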

....Robert

From: Joe Conway on 9 Apr 2010 12:16

On 04/09/2010 07:33 AM, Robert Haas wrote:
> On Fri, Apr 9, 2010 at 7:55 AM, Yeb Havinga <yebhavinga(a)gmail.com> wrote:
>> 'tagging' a datatype as a linear order, it could automatically have a range
>> type defined on it, like done for the array types currently.
>
> The way we've handled array types is, quite frankly, horrible. It's
> bad enough that we now have two catalog entries in pg_type for each
> base type; what's even worse is that if we actually wanted to enforce
> things like the number of array dimensions we'd need even more - say,
> seven per base type, one for the base type itself, one for a
> one-dimensional array, one for a two-dimensional array, one for a
> three-dimensional array. And then if we want to support range types
> that's another one for every base type, maybe more if there's more
> than one kind of range over a base type. It's just not feasible to
> handle derived types in a way that requires a new instance of each base
> type to be created for each kind of derived type. It scales as
> O(number of base types * number of kinds of derived type), and that
> rapidly gets completely out of hand.

Perhaps off the original topic (and thinking out loud), but I agree with
you on the handling of array types. I have long thought (and at least
once played with the idea) that a single array type, anyarray, made up
of elements, anyelement, could be made to work. Further, anyelement
should be defined to be any valid existing type, including anyarray.
Essentially, at least by my reading of the SQL spec, a multidimensional
array ought to be an array of arrays, which is different in subtle ways
from what we have today.
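
One way to picture the array-of-arrays model is as a recursive type expression, sketched below in Python. Base, Array and dimensions are hypothetical names for illustration only; nothing here reflects how PostgreSQL actually represents array types.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Base:
    """A non-array type, identified by name."""
    name: str


@dataclass(frozen=True)
class Array:
    """An array whose element may be any type, including another Array."""
    element: "TypeExpr"


TypeExpr = Union[Base, Array]


def dimensions(t):
    """Number of array levels wrapped around the innermost base type."""
    n = 0
    while isinstance(t, Array):
        n += 1
        t = t.element
    return n
```

In this model the number of dimensions falls out of the nesting depth, so Array(Base("int4")) and Array(Array(Base("int4"))) are distinct types without a separate catalog entry per dimension count.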

Joe

From: "Kevin Grittner" on 9 Apr 2010 13:13

Robert Haas <robertmhaas(a)gmail.com> wrote:

> Given a type T, I think we'd like to be able to define a type U as
> "the natural type to be added to or subtracted from T". As Jeff
> pointed out to me, this is not necessarily the same as the
> underlying type. For example, if T is a timestamp, U is an
> interval; if T is a numeric, U is also a numeric; if T is a cidr,
> U is an integer. Then we'd like to define a canonical addition
> operator and a canonical subtraction operator.

As it is de rigueur for someone to escalate the proposed complexity
of an idea by at least one order of magnitude, and everyone else has
fallen down on this one: ;-)

I've often thought that if we rework the type system, it would be
very nice to support a concept of hierarchy. If you could
"subclass" money to have a subclass like assessable, which in turn
has subclasses of fine, fee, restitution, etc. you could then
automatically do anything with a subclass which you could do with
the superclass, and support such things as treating the sum of
various classes as the lowest common subclass. It seems like this
sort of approach, if done right, might allow some easier way to
establish sensible operations between types (like distance / speed =
time or speed * time = distance).
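
A minimal Python sketch of the hierarchy idea follows. The class names come from the example above; common_ancestor is a hypothetical helper, and the sketch reads "lowest common subclass" as the most specific class that both operands share, which is an interpretation rather than something stated in the thread.

```python
def common_ancestor(a, b):
    """Most specific class that both a and b derive from."""
    for cls in a.__mro__:
        if issubclass(b, cls):
            return cls
    raise TypeError("no common ancestor")  # not reached: object is shared


class Money:
    def __init__(self, cents):
        self.cents = cents  # integer cents, to avoid float rounding

    def __add__(self, other):
        if not isinstance(other, Money):
            return NotImplemented
        result_type = common_ancestor(type(self), type(other))
        return result_type(self.cents + other.cents)


class Assessable(Money):
    pass


class Fine(Assessable):
    pass


class Fee(Assessable):
    pass
```

Here Fine(100) + Fee(50) yields an Assessable of 150 cents, Fine + Fine stays a Fine, and adding plain Money to anything falls back to Money.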

Just a thought....

-Kevin
