Skip to Content

Ben Thompson: Informed Mix

Syndicate content Informed Mix
Random postings about IBM Informix
Updated: 1 hour 45 min ago

When do my stored procedure execution plans get updated?

Fri, 2017-06-23 19:02

For the sake of brevity in this article I am going to group procedures, functions and routines together as stored procedures and ignore any differences between them.

What does the SQL command UPDATE STATISTICS FOR PROCEDURE/FUNCTION/ROUTINE does and perhaps, more pertinently, as a DBA do I need to run this regularly to ensure my systems are working efficiently? For those wanting an immediate answer I think it is “never” or “almost never“, the reasons for which I hope to explain clearly in this article.

The command itself is straightforward: calling it causes Informix to parse a stored procedure’s code and produce a query plan for all of it based on the current statistics and data distributions (if the procedure references any tables). It then writes the query plan to the sysprocplan table which is, unless you have an unlogged database, a logged operation written to the logical log. Used with no parameters it does this for all stored procedures in the system.

As long as the query plan in sysprocplan is reasonably efficient there is probably no need to (ever) proactively update it but there may be cases when you’d want to do so, for example, if a very small or empty table has grown into a large one. However if you were to do this your new plan would be based on the current table statistics and data distributions and if these haven’t been updated yet you may get the same, now inefficient, plan.

The manual states:

The sysprocplan system catalog table stores execution plans for SPL routines. Two actions can update the sysprocplan system catalog table:

  • Execution of an SPL routine that uses a modified table
  • The UPDATE STATISTICS FOR ROUTINE, FUNCTION, or PROCEDURE statement.

There is a created column in the sysprocplan table but it’s a date and not a date/time which makes it much harder to match plan updates to other events.

So what is a modified table? Quite simply it is one where the version number has been incremented. You can see the version number with an SQL query like:

select version from systables where owner='myowner' and tabname='mytable';

I think the only reference to this in manual is in section about the systables view where it simply says:

version
INTEGER
Number that changes when table is altered

How the engine works out the dependencies a stored procedure has on different tables falls into the category of system internals, which IBM chooses not to publicly document, but I think it’s safe to say that if a table is referenced anywhere in a procedure it is dependent on it.

There are many ways a table can be “altered”, some more obvious than others:

Method Version number incremented by GRANT SELECT 1 GRANT UPDATE 1 GRANT SELECT 1 GRANT SELECT, UPDATE 1 UPDATE STATISTICS HIGH 2 UPDATE STATISTICS MEDIUM 2 UPDATE STATISTICS [LOW] 2 RENAME COLUMN 65536 CREATE INDEX 65536 ADD column 131072 DROP column 131072 GRANT CONNECT 0 GRANT RESOURCE 0 GRANT DBA 1

I am not sure why some operations increment the value by large numbers, all powers of 2, as any increment has a similar effect, at least as far as the scope of this article is concerned.

The table is not a complete list because there are many possible DDL operations but this does already illustrate or suggest that:

  • On most systems it’s likely that UPDATE STATISTICS commands will be the main trigger for stored query plans to be updated. If you run LOW, HIGH and MEDIUM modes for a table like you will if you use AUS or dostats, you’ll trigger at least three updates for dependent stored procedures.
  • If we want to grant multiple privileges on the same table, it’s best to do it in a single statement because if a dependent stored procedure is being called in between running commands by an application, its stored execution plan will be updated only once.
  • GRANT DBA is not a table level operation yet it has an effect.

Further testing shows that both the GRANT DBA and REVOKE DBA statements increment the version number on all tables in the same database where the tabid is 100 or greater, that is all user tables. From the manual above it follows that the stored query plans for all stored procedures or functions dependent on a table will be updated the next time they are executed.

On our systems we see a large amount of writes to sysprocplan after granting or revoking the DBA privilege to anyone. When graphed we see a sharp peak and a long exponential tail off as less commonly used procedures get called.

Therefore if you grant DBA to a user on a busy live system, it can affect concurrency more than you might expect. On an idle system you may want to run UPDATE STATISTICS FOR PROCEDURE immediately afterwards to update the stored query plans in an orderly way and save the first session to call any given procedure from this overhead.

I think running the command offline to avoid the overhead for user or application sessions is possibly the only true use case for this command.


Improving remote query performance by tuning FET_BUF_SIZE

Mon, 2017-04-03 15:12

I thought I’d write blog post as a nice example of where tuning the client-side variable, FET_BUF_SIZE, really speeded up a remote query.

FET_BUF_SIZE is documented by IBM in the context of a Java application using JDBC here and as a server environment variable here.

One thing the documentation warns about is that simply setting this to a high value may degrade performance, especially if you have a lot of connections. With that in mind here are some facts about the query I’m running and using as a basis for these tests:

  • I am just using a single connection to the database.
  • the query returns around 10000 rows and 60 Mb of data.
  • the client and the server are geographically separated from each other and Art Kagel’s dbping utility typically takes around 0.1 seconds to connect remotely; this compares with around 3 milliseconds locally.
  • crucially the query runs in seconds locally on the server but takes over three minutes when run remotely.

If I begin running the query with the default value of FET_BUF_SIZE and monitor waits on the server, I can see that reads only go up slowly and that my session is waiting on a condition (indicated by the Y in position one of column two) more or less all the time:

> while [ 1 ] ; do
> onstat -u | grep thompson
> sleep 1
> done
Userthreads
address flags sessid user tty wait tout locks nreads nwrites
26eb492d18 Y--P-R- 76228 thompson 0 26e67cd298 0 0 552 0
26eb492d18 Y--P-R- 76228 thompson 0 26e67cd298 0 0 552 0
26eb492d18 Y--P-R- 76228 thompson 0 26e67cd298 0 0 560 0
26eb492d18 Y--P-R- 76228 thompson 0 26e67cd298 0 0 560 0
26eb492d18 Y--P-R- 76228 thompson 0 26e67cd298 0 0 568 0
26eb492d18 Y--P-R- 76228 thompson 0 26e67cd298 0 0 576 0
26eb492d18 Y--P-R- 76228 thompson 0 26e67cd298 0 0 592 0
26eb492d18 Y--P-R- 76228 thompson 0 26e67cd298 0 0 624 0
26eb492d18 Y--P-R- 76228 thompson 0 26e67cd298 0 0 624 0

The sixth column shows the rstcb value of the thread I’m waiting on. I can use onstat -g con (print conditions with waiters) to see that I’m waiting on the network:

> onstat -g con | grep -E '^cid|26e67cd298'
cid addr name waiter waittime
5789 26e67cd298 netnorm 84353 0

A quick check with onstat -g ses 76228 shows that thread id. 84353 does indeed correspond to my session.

While the wait time shown above is not increasing it’s a different story when we look at netstat, again on the server:

> netstat -nc | grep '172.16.0.1'
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address Foreign Address State
tcp 0 1312 10.0.0.1:9088 172.16.0.1:37004 ESTABLISHED
tcp 0 1284 10.0.0.1:9088 172.16.0.1:37004 ESTABLISHED
tcp 0 1306 10.0.0.1:9088 172.16.0.1:37004 ESTABLISHED
tcp 0 1302 10.0.0.1:9088 172.16.0.1:37004 ESTABLISHED
tcp 0 1194 10.0.0.1:9088 172.16.0.1:37004 ESTABLISHED
tcp 0 1206 10.0.0.1:9088 172.16.0.1:37004 ESTABLISHED
tcp 0 1266 10.0.0.1:9088 172.16.0.1:37004 ESTABLISHED
tcp 0 1304 10.0.0.1:9088 172.16.0.1:37004 ESTABLISHED
tcp 0 1318 10.0.0.1:9088 172.16.0.1:37004 ESTABLISHED
tcp 0 1248 10.0.0.1:9088 172.16.0.1:37004 ESTABLISHED

What the above is showing us is that there are consistently around 1200 to 1300 bytes in the send queue (Send-Q). This is surely our bottleneck.

At this point when investigating the problem I considered modifying other parameters such as OPTOFC and Linux kernel parameters. However with a few moment’s thought it was clear these weren’t going to gain anything: OPTOFC optimises the open-fetch-close sequence and for a single long running query this is not going to give us anything measurable; and an investigation into increasing the Linux kernel parameter related to the send queue size was dismissed when we found that 1300 bytes was well below the maximum allowed.

In Informix 11.50 the maximum value of FET_BUF_SIZE is 32767 (32 kb) but this is increased to 2147483648, or as we’ll see actually 2147483647, (2 Gb) in 11.70 and above. We can therefore move onto to experiment with different values:

FET_BUF_SIZE Query run time (s) Average Send-Q size over 10 samples Maximum Send-Q size observed Default 221.2 1274 1332 1024 221.1 1255 1326 2048 221.1 1285 1338 4096 221.2 1297 1360 6144 102.1 2564 2676 8192 56.6 5031 5210 16384 22.6 12490 13054 32767 (max. 11.50 value) 11.5 24665 29968 65536 7.0 62188 62612 131072 4.9 115793 127826 262144 4.0 146686 237568 524288 3.5 184320 249856 1048576 3.3 245760 473616 2097152 3.2 249856 486352 2147483647 (max. value – 1) 3.0 245760 549352 2147483648 (supposed max. value) 221.3 1276 1366

As the run times get shorter it gets tricky to measure the Send-Q using netstat -nc: it can be sampled very frequently using a command like:

while [ 1 ] ; do
netstat -n | grep '172.16.0.1'
done

This will produce many measurements per second and with this it’s possible to see it fill up and drain several times in the period while the statement is running.

It’s also interesting to play around with the boundaries. For example, with a FET_BUF_SIZE between around 5500 and 5600 maximum Send-Q sizes the same as those consistently achieved with a FET_BUF_SIZE of 6144 begin to creep into the results but many measurements remain around the values consistently measured wit a FET_BUF_SIZE of 4096:

Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address Foreign Address State
tcp 0 1316 10.0.0.1:9088 172.16.0.1:37488 ESTABLISHED
tcp 0 1318 10.0.0.1:9088 172.16.0.1:37488 ESTABLISHED
tcp 0 1278 10.0.0.1:9088 172.16.0.1:37488 ESTABLISHED
tcp 0 1352 10.0.0.1:9088 172.16.0.1:37488 ESTABLISHED
tcp 0 1288 10.0.0.1:9088 172.16.0.1:37488 ESTABLISHED
tcp 0 2546 10.0.0.1:9088 172.16.0.1:37488 ESTABLISHED
tcp 0 1278 10.0.0.1:9088 172.16.0.1:37488 ESTABLISHED
tcp 0 2502 10.0.0.1:9088 172.16.0.1:37488 ESTABLISHED
tcp 0 1266 10.0.0.1:9088 172.16.0.1:37488 ESTABLISHED
tcp 0 1314 10.0.0.1:9088 172.16.0.1:37488 ESTABLISHED
tcp 0 2506 10.0.0.1:9088 172.16.0.1:37488 ESTABLISHED
tcp 0 1292 10.0.0.1:9088 172.16.0.1:37488 ESTABLISHED

So what are the conclusions?

  • Increasing FET_BUF_SIZE at the client side can dramatically improve the speed of remote queries.
  • Maximum Send-Q sizes, as measured by netstat, increase in discrete steps as FET_BUF_SIZE is increased.
  • A larger Send-Q allows more data to be cached and reduces waits seen in Informix.
  • To see any improvement at all FET_BUF_SIZE must be increased to at least 6000 (approximate value).
  • Around boundaries between maximum Send-Q sizes there appears to be a cross-over region where maximum send queue sizes overlap from two adjacent values are seen from one second to the next.
  • The maximum value allowed in 11.70 at least is 2147483647 and not 2147483648, as indicated in the documentation.
  • The maximum 11.50 value of 32767 produced a run time nearly 4x slower than an optimised value for 11.70+
  • Other testing I did, not documented here, shows that the results are uniform across JDBC and ESQL/C applications.

Note: all user names, IP addresses and port numbers used in this post have been altered.


Copyright