Prev: Proposal of SE-PostgreSQL patches [try#2]
Next: Dept of ugly hacks: eliminating padding space in system indexes
From: Tom Lane on 23 Jun 2008 15:52

I was thinking a bit about how we pad columns of type NAME to
fixed-width, even though they're semantically equivalent to C strings.
The reason for wasting that space is that it makes it possible to
overlay a C struct onto the leading columns of most system catalogs.
I don't wish to propose changing that (at least not today), but it
struck me that there is no reason to overlay a C struct onto index
entries, and that getting rid of the padding space would be even more
useful in an index than in the catalog itself.

It turns out to be dead easy to implement this: effectively, we just
decree that the index column storage type for NAME is always CSTRING.
Because the two types are effectively binary-compatible as long as you
don't look at the padding, the attached ugly-but-impressively-short
patch seems to accomplish this. It passes the regression tests anyway.

Here are some numbers about the space savings in a virgin database:

                                                     CVS HEAD   w/patch   savings
pg_database_size('postgres')                          4439752   4071112   8.3%
pg_relation_size('pg_class_relname_nsp_index')          57344     40960   28%
pg_relation_size('pg_proc_proname_args_nsp_index')     319488    204800   35%

Cutting a third off the size of a system index has got to be worth
something, but is it worth a hack as ugly as this one?

regards, tom lane

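
[Illustration, not part of the original message or the attached patch.] The space argument above can be sketched in a few lines of C. This assumes the default NAMEDATALEN of 64; ToyName and the toy_* helpers are hypothetical names invented for this sketch and are not backend code. It only shows that a padded name and a NUL-terminated C string carry the same bytes up to the terminator, while the C-string form needs far fewer bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define NAMEDATALEN 64          /* assumed: PostgreSQL's default build setting */

/* Hypothetical stand-in for a fixed-width NAME value. */
typedef struct ToyName
{
    char        data[NAMEDATALEN];
} ToyName;

/* Store s the way a padded name is laid out: the string, then zero bytes. */
static void
toy_name_set(ToyName *n, const char *s)
{
    size_t      len = strlen(s);

    assert(len < NAMEDATALEN);  /* must leave room for the terminator */
    memset(n->data, 0, NAMEDATALEN);
    memcpy(n->data, s, len);
}

/* Bytes used when the value is stored fixed-width. */
static size_t
toy_padded_size(const ToyName *n)
{
    (void) n;                   /* size does not depend on the contents */
    return sizeof(ToyName);
}

/* Bytes used when only the string and its terminator are stored. */
static size_t
toy_cstring_size(const ToyName *n)
{
    return strlen(n->data) + 1;
}

/*
 * Compare a padded name against a plain C string.  The comparison stops
 * at the first NUL, so it never looks at the padding.
 */
static int
toy_name_cmp(const ToyName *a, const char *b)
{
    return strncmp(a->data, b, NAMEDATALEN);
}
```

For a name such as "pg_class", toy_padded_size() reports 64 bytes and toy_cstring_size() reports 9. Real index tuples also carry header and alignment overhead, so the on-disk savings are smaller than this raw ratio, which is consistent with the 28% to 35% figures quoted above.
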
From: Tom Lane on 23 Jun 2008 19:09

Mark Mielke <mark(a)mark.mielke.cc> writes:
>> Tom Lane wrote:
>>> Cutting a third off the size of a system index has got to be worth
>>> something, but is it worth a hack as ugly as this one?

> Were you able to time any speedup?

I didn't try; can you suggest any suitable benchmark?

The performance impact is probably going to be limited by our extensive
use of catalog caches --- once a desired row is in a backend's catcache,
it doesn't take a btree search to fetch it again. Still, the system
indexes are probably "hot" enough to stay in shared buffers most of the
time, and the smaller they are the more space will be left for other
stuff, so I think there should be a distributed benefit.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

From: Tom Lane on 23 Jun 2008 19:45

Simon Riggs <simon(a)2ndquadrant.com> writes:
> On Mon, 2008-06-23 at 15:52 -0400, Tom Lane wrote:
>> Cutting a third off the size of a system index has got to be worth
>> something, but is it worth a hack as ugly as this one?

> Not doing it would be more ugly, unless there is some negative
> side-effect?

I thought some more about why this seems ugly to me, and realized that
a lot of it has to do with the change in typalign. Currently, a
compiler is entitled to assume that a pointer to Name is 4-byte aligned;
thus for instance it could generate word-wide instructions for copying a
Name from one place to another. A "Name" that is stored as just CSTRING
might break that.

We are already at risk of this, really, because of all the places where
we gaily pass plain old C strings to syscache and index searches on Name
columns. I think the only reason we've not been burnt is that it's hard
to optimize strcmp() into word-wide operations.

However the solution to that seems fairly obvious: let's downgrade Name
to typalign 1 instead of 4.

regards, tom lane

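
[Illustration, not part of the original message.] A minimal sketch of the alignment concern, under stated assumptions: toy_align_up and toy_is_aligned are hypothetical helpers written for this example, not backend macros. The first shows how a 4-byte alignment requirement pushes a value to the next multiple of 4 within a tuple while a 1-byte requirement leaves it where it is. The second shows that a string starting at such an unpadded offset need not sit on a 4-byte boundary, which is what would trip up code that assumed word alignment.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Round offset up to the next multiple of align.
 * align must be a power of two (1, 2, 4, 8).
 */
static size_t
toy_align_up(size_t offset, size_t align)
{
    return (offset + align - 1) & ~(align - 1);
}

/*
 * Report whether p sits on an align-byte boundary.  Converting a pointer
 * to uintptr_t and taking the remainder works on common flat-address
 * platforms; that is an assumption of this sketch.
 */
static int
toy_is_aligned(const void *p, size_t align)
{
    return ((uintptr_t) p % align) == 0;
}
```

With a preceding field ending at offset 5, toy_align_up(5, 4) places the next value at offset 8, whereas toy_align_up(5, 1) leaves it at 5. Byte-wise routines such as strcmp() and memcpy() are safe at either address; only code that assumes 4-byte alignment is at risk.
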
From: "Heikki Linnakangas" on 24 Jun 2008 08:21

Shane Ambler wrote:
> My question is whether this is limited to system catalogs? or will this
> benefit char() index used on any table? The second would make it more
> worthwhile.

char(n) fields are already stored as variable-length on disk. This isn't
applicable to them.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

From: Tom Lane on 24 Jun 2008 10:33

Teodor Sigaev <teodor(a)sigaev.ru> writes:
>> dead easy to implement this: effectively, we just decree that the
>> index column storage type for NAME is always CSTRING. Because the

> Isn't it a reason to add STORAGE option of CREATE OPERATOR CLASS to BTree? as
> it's done for GiST and GIN indexes.

Hmm ... I don't see a point in exposing that as a user-level facility,
unless you can point to other use-cases besides NAME. But it would be
cute to implement the hack by changing the initial contents of
pg_opclass instead of inserting code in the backend. I'll give that
a try.

regards, tom lane