<?xml version="1.0" encoding="utf-8" ?>

<rss version="2.0" 
   xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
   xmlns:admin="http://webns.net/mvcb/"
   xmlns:dc="http://purl.org/dc/elements/1.1/"
   xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
   xmlns:wfw="http://wellformedweb.org/CommentAPI/"
   xmlns:content="http://purl.org/rss/1.0/modules/content/"
   >
<channel>
    
    <title>Magnus Hagander's PostgreSQL blog (Entries tagged as window aggregates)</title>
    <link>http://blog.hagander.net/</link>
    <description></description>
    <dc:language>en</dc:language>
    <generator>Serendipity 1.6.2 - http://www.s9y.org/</generator>
    
    

<item>
    <title>Finding gaps in partitioned sequences</title>
    <link>http://blog.hagander.net/archives/203-Finding-gaps-in-partitioned-sequences.html</link>
            <category>PostgreSQL</category>
    
    <comments>http://blog.hagander.net/archives/203-Finding-gaps-in-partitioned-sequences.html#comments</comments>
    <wfw:comment>http://blog.hagander.net/wfwcomment.php?cid=203</wfw:comment>

    <slash:comments>4</slash:comments>
    <wfw:commentRss>http://blog.hagander.net/rss.php?version=2.0&amp;type=comments&amp;cid=203</wfw:commentRss>
    

    <author>nospam@example.com (Magnus Hagander)</author>
    <content:encoded>
    &lt;p&gt;There are an almost unlimited number of articles on the web about how to find gaps in sequences in SQL. And it doesn&#039;t have to be very hard. Doing it in a &quot;partitioned sequence&quot; makes it a bit harder, but still not very hard. But when I turned to a window aggregate to do that, I was immediately told &quot;hey, that&#039;s a good example of a window aggregate to solve your daily chores, you should blog about that&quot;. So here we go - yet another example of finding a gap in a sequence using SQL.&lt;/p&gt;

&lt;p&gt;I have a database that is very simply structured - it&#039;s got a primary key made out of &lt;i&gt;(groupid, year, month, seq)&lt;/i&gt;, all integers. On top of that it has a couple of largish text fields and an fti field for full text search. (Initiated people will know right away which database this is). The sequence in the seq column resets to zero for each combination of &lt;i&gt;(groupid, year, month)&lt;/i&gt;. And I wanted to find out where there were gaps in it, and how big they were, to debug the tool that wrote the data into the database. This is really easy with a window aggregate:&lt;/p&gt;


&lt;pre&gt;&lt;code&gt;&lt;div class=&quot;geshi&quot; style=&quot;text-align: left&quot;&gt;&lt;br /&gt;&lt;span style=&quot;color: #993333; font-weight: bold;&quot;&gt;SELECT&lt;/span&gt; * &lt;span style=&quot;color: #993333; font-weight: bold;&quot;&gt;FROM&lt;/span&gt; &lt;span style=&quot;color: #66cc66;&quot;&gt;&amp;#40;&lt;/span&gt;&lt;br /&gt;&amp;#160; &amp;#160;&lt;span style=&quot;color: #993333; font-weight: bold;&quot;&gt;SELECT&lt;/span&gt;&lt;br /&gt;&amp;#160; &amp;#160; &amp;#160; groupid,&lt;br /&gt;&amp;#160; &amp;#160; &amp;#160; year,&lt;br /&gt;&amp;#160; &amp;#160; &amp;#160; month,&lt;br /&gt;&amp;#160; &amp;#160; &amp;#160; seq, &lt;br /&gt;&amp;#160; &amp;#160; &amp;#160; seq-lag&lt;span style=&quot;color: #66cc66;&quot;&gt;&amp;#40;&lt;/span&gt;seq,&lt;span style=&quot;color: #cc66cc;&quot;&gt;1&lt;/span&gt;&lt;span style=&quot;color: #66cc66;&quot;&gt;&amp;#41;&lt;/span&gt; OVER &lt;span style=&quot;color: #66cc66;&quot;&gt;&amp;#40;&lt;/span&gt;PARTITION &lt;span style=&quot;color: #993333; font-weight: bold;&quot;&gt;BY&lt;/span&gt; groupid, year, month &lt;span style=&quot;color: #993333; font-weight: bold;&quot;&gt;ORDER&lt;/span&gt; &lt;span style=&quot;color: #993333; font-weight: bold;&quot;&gt;BY&lt;/span&gt; seq&lt;span style=&quot;color: #66cc66;&quot;&gt;&amp;#41;&lt;/span&gt; &lt;span style=&quot;color: #993333; font-weight: bold;&quot;&gt;AS&lt;/span&gt; gap &lt;span style=&quot;color: #993333; font-weight: bold;&quot;&gt;FROM&lt;/span&gt; mytable&lt;br /&gt;&lt;span style=&quot;color: #66cc66;&quot;&gt;&amp;#41;&lt;/span&gt; &lt;span style=&quot;color: #993333; font-weight: bold;&quot;&gt;AS&lt;/span&gt; t&lt;br /&gt;&lt;span style=&quot;color: #993333; font-weight: bold;&quot;&gt;WHERE&lt;/span&gt; &lt;span style=&quot;color: #993333; font-weight: bold;&quot;&gt;NOT&lt;/span&gt; &lt;span style=&quot;color: #66cc66;&quot;&gt;&amp;#40;&lt;/span&gt;t.gap=&lt;span style=&quot;color: #cc66cc;&quot;&gt;1&lt;/span&gt;&lt;span style=&quot;color: #66cc66;&quot;&gt;&amp;#41;&lt;/span&gt;&lt;br /&gt;&lt;span style=&quot;color: #993333; font-weight: bold;&quot;&gt;ORDER&lt;/span&gt; &lt;span style=&quot;color: #993333; font-weight: bold;&quot;&gt;BY&lt;/span&gt; groupid, year, month, seq&lt;br /&gt;&amp;#160;&lt;/div&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;One advantage to using a window aggregate for this is that we actually get the whole row back, and not just the primary key - so it&#039;s easy enough to include all the data you need to figure something out.&lt;/p&gt;

&lt;p&gt;What about performance? I don&#039;t really have a big database to test this on, so I can&#039;t say for sure. It&#039;s going to be a sequential scan, since I look at the &lt;i&gt;whole&lt;/i&gt; table,and not just parts of it. It takes about 4 seconds to run over a table of about a million rows, 2.7Gb, on a modest VM with no actual I/O capacity to speak of and a very limited amount of memory, returning about 100 rows. It&#039;s certainly by far fast enough for me in this case.&lt;/p&gt;

&lt;p&gt;And as a bonus, it found me two bugs in the loading script and at least one bug in somebody elses code that I&#039;m now waiting on to get fixed...&lt;/p&gt;

 
    </content:encoded>

    <pubDate>Fri, 27 Jan 2012 16:53:52 +0000</pubDate>
    <guid isPermaLink="false">http://blog.hagander.net/archives/203-guid.html</guid>
    <category>postgresql</category>
<category>sql</category>
<category>window aggregates</category>

</item>
<item>
    <title>What's your favorite 8.4 feature?</title>
    <link>http://blog.hagander.net/archives/141-Whats-your-favorite-8.4-feature.html</link>
            <category>PostgreSQL</category>
    
    <comments>http://blog.hagander.net/archives/141-Whats-your-favorite-8.4-feature.html#comments</comments>
    <wfw:comment>http://blog.hagander.net/wfwcomment.php?cid=141</wfw:comment>

    <slash:comments>9</slash:comments>
    <wfw:commentRss>http://blog.hagander.net/rss.php?version=2.0&amp;type=comments&amp;cid=141</wfw:commentRss>
    

    <author>nospam@example.com (Magnus Hagander)</author>
    <content:encoded>
    &lt;p&gt;Over the past month or so, I have been informally polling those of our customers I&#039;ve been meeting with about which features in the upcoming 8.4 version of PostgreSQL they are most excited about, and most likely to have use for in the short-to-medium term. Part of the results were exactly what I expected, other parts were a bit more surprising.&lt;/p&gt;

&lt;p&gt;First of all, the one feature that &lt;i&gt;every single one&lt;/i&gt; said they were very much looking forward to is the updated &lt;a onclick=&quot;_gaq.push([&#039;_trackPageview&#039;, &#039;/extlink/www.chesnok.com/daily/2009/02/10/fsm-visibility-map-and-new-vacuum-awesomeness/&#039;]);&quot;  href=&quot;http://www.chesnok.com/daily/2009/02/10/fsm-visibility-map-and-new-vacuum-awesomeness/&quot; onclick=&quot;window.open(this.href, &#039;_blank&#039;); return false;&quot;&gt;Free Space Map&lt;/a&gt; &lt;a onclick=&quot;_gaq.push([&#039;_trackPageview&#039;, &#039;/extlink/www.depesz.com/index.php/2008/10/04/waiting-for-84-new-fsm-free-space-map/&#039;]);&quot;  href=&quot;http://www.depesz.com/index.php/2008/10/04/waiting-for-84-new-fsm-free-space-map/&quot; onclick=&quot;window.open(this.href, &#039;_blank&#039;); return false;&quot;&gt;implementation&lt;/a&gt;, which will remove two very annoying and difficult-to-get-right configuration parameters (&lt;i&gt;max_fsm_pages&lt;/i&gt; and &lt;i&gt;max_fsm_relations&lt;/i&gt;). So kudos to Heikki for that. Closely related to this feature is the &lt;a href=&quot;http://blog.hagander.net/archives/129-Visibility-map-arrives.html&quot; onclick=&quot;window.open(this.href, &#039;_blank&#039;); return false;&quot;&gt;Visibility Map&lt;/a&gt;, and it&#039;s reduction of &lt;i&gt;VACUUM&lt;/i&gt; requirements. This is, however, not something a lot of our customers will have a direct use for - other than one or two, they don&#039;t actually have any issues with &lt;i&gt;VACUUM&lt;/i&gt; (once they learned to remove &lt;i&gt;VACUUM FULL&lt;/i&gt; from the nightly cronjobs).&lt;/p&gt;

&lt;p&gt;When looking closer at the SQL level functionality that people have been excited above, I expected to see &lt;a onclick=&quot;_gaq.push([&#039;_trackPageview&#039;, &#039;/extlink/www.depesz.com/index.php/2008/10/07/waiting-for-84-common-table-expressions-with-queries/&#039;]);&quot;  href=&quot;http://www.depesz.com/index.php/2008/10/07/waiting-for-84-common-table-expressions-with-queries/&quot; onclick=&quot;window.open(this.href, &#039;_blank&#039;); return false;&quot;&gt;Common Table Expressions&lt;/a&gt; (CTEs), AKA &lt;i&gt;Recursive Queries&lt;/i&gt; at the top of the list, based on requests on our lists in the future. However, it seems that at least to our customers, &lt;a onclick=&quot;_gaq.push([&#039;_trackPageview&#039;, &#039;/extlink/www.depesz.com/index.php/2009/01/21/waiting-for-84-window-functions/&#039;]);&quot;  href=&quot;http://www.depesz.com/index.php/2009/01/21/waiting-for-84-window-functions/&quot; onclick=&quot;window.open(this.href, &#039;_blank&#039;); return false;&quot;&gt;Window Aggregates&lt;/a&gt; are a lot more interesting than &lt;i&gt;CTE&lt;/i&gt;s. I&#039;m sure there are a lot of use-cases for recursive queries - there just seem to be even more of them for window aggregates.&lt;/p&gt;

&lt;p&gt;Parallel restoring of dumps is also fairly high on peoples list, though &lt;i&gt;most&lt;/i&gt; of the people that can really use it are deploying different ways to deal with it already. But they will definitely be using it once it&#039;s there. As well as the speedups of the PITR recovery and warm standby.&lt;/p&gt;

&lt;p&gt;There are a lot of other features in the upcoming version as well. Some are small, some are large. There are still some smaller features (for example \ef  in psql) that I keep missing a lot every time I end up working on an earlier version. They&#039;re all (well, most of them) great, but maybe not so obvious at first.&lt;/p&gt;

&lt;p&gt;What&#039;s your favorite 8.4 feature?&lt;/p&gt;

 
    </content:encoded>

    <pubDate>Tue, 14 Apr 2009 19:39:00 +0000</pubDate>
    <guid isPermaLink="false">http://blog.hagander.net/archives/141-guid.html</guid>
    <category>8.4</category>
<category>cte</category>
<category>fsm</category>
<category>postgresql</category>
<category>window aggregates</category>

</item>

</channel>
</rss>