Author: Deepak Shenoy (CD
paper for the Annual Borland Conference, San Jose, 2003)
Download the source code for this article (Compressed ZIP file, 555 KB)
Abstract
Learn to build better and more responsive web applications using Delphi. We'll
cover:
- Testing tools to test your web application's performance
- Caching WebModule instances in ISAPI applications
- Using the ISAPI Thread Pool
- Database engines and web application performance
- Using database connection pools
- Data compression and output buffering
- ASP.NET optimization techniques
Introduction
Web applications are usually CGI, ISAPI or Apache modules that serve web
content. While the general concept of writing web applications is something we
are familiar with, the devil lies in the details. During development, our
energies are focussed on building the application, and we usually falter in
testing web applications for much higher loads, response time etc. This paper
focusses on helping you build applications that can handle higher loads and
respond faster.
Note: This paper covers ISAPI DLL based web applications. (There are ASP.NET
optimizations too, further down.) If you are using a CGI application, this
content may not apply.
Testing tools and framework
To be able to test our web applications for higher loads, we need to use an
automated testing tool. I'm going to use Microsoft's Web Application Stress
Tool (WAST).
We're going to test a simple application - the iserver application in the
Demos\Internet\WebServ\IIS folder within your Delphi installation directory.
Let's first compile it and place it under the scripts folder.
Once done, we will run WAST. WAST comes up with an initial screen like so:
Now hit Record. A web browser comes up, and you can navigate to the URL
of our server application, which on my machine is
http://localhost/scripts/iserver.dll. I've hit a couple of links, then gone
back to WAST and stopped recording.
Here's how my screen looks:
You'll notice that at the bottom you can see various entries for the pages I
visited. You can change any entry, delete it, add a new one etc.
Now go to the settings node on the tree on the left. You will notice a screen
like:
I've changed the settings as above. Basically this means that:
- The script should run for 2 minutes.
- We are going to use 50 threads, each of which creates 2 socket connections to
  the server.
- We'll use a random delay of up to 5 seconds, so that the threads do not all
  start at once.
- We're going to restrict bandwidth to 56 K; otherwise the service may appear
  faster in our testing than in real life.
Let's then run this test. Click on the script name ("New Recorded Script") and
then select Run from the Scripts menu. The test will run for two minutes, and
will give us the results as a report. You can view reports by clicking Reports
in the View menu.
For the default application, compiled as is, the report looks like this:
Overview
==================================================================
Report name: 3/17/2003 5:11:12 PM
Run on: 3/17/2003 5:11:12 PM
Run length: 00:02:05
Web Application Stress Tool Version:1.1.293.1
Number of test clients: 1
Number of hits: 434
Requests per Second: 3.64
Socket Statistics
-----------------------------------------------------------------------
Socket Connects: 2157
Total Bytes Sent (in KB): 174.81
Bytes Sent Rate (in KB/s): 1.47
Total Bytes Recv (in KB): 434.40
Bytes Recv Rate (in KB/s): 3.65
Socket Errors
-----------------------------------------------------------------------
Connect: 0
Send: 1657
Recv: 5
Timeouts: 0
Result Codes
Code Description Count
=======================================================================
200 OK 111
500 Internal Server Error 318
NA HTTP result code not given 5
Page Summary
Page Hits TTFB Avg TTLB Avg Auth
=======================================================================
GET /scripts/iserver.dll 74 3002.95 3023.19 No
GET /scripts/iserver.dll/custo 67 6085.73 6613.07 No
GET /scripts/iserver.dll/runqu 53 3134.34 3786.60 No
GET /scripts/iserver.dll/custo 56 5974.11 6536.77 No
GET /scripts/iserver.dll/runqu 24 3518.42 3518.75 No
GET /scripts/iserver.dll/custo 36 6345.33 6640.42 No
GET /scripts/iserver.dll/runqu 40 4295.88 4739.80 No
GET /scripts/iserver.dll/runqu 24 3191.42 3310.71 No
GET /scripts/iserver.dll/runqu 35 3922.29 3922.54 No
GET /scripts/iserver.dll/custo 25 1656.96 1657.32 No
Let's analyze this report.
- The Number of Hits is important: it gives you an indication of how many
  requests were sent, and the Requests per Second figure tells you the average
  request rate.
- Socket Errors are obviously important, and in this case we have 1657 send
  errors and 5 receive errors. While the errors could be in transmission, in
  this case it might just be because the application was too busy to respond.
- Result Codes: 200 is "OK". 500 is "Internal Server Error", maybe because the
  server couldn't handle the load.
- Page Summary: this shows you the number of hits for each page, along with the
  average Time To First Byte (TTFB) and the average Time To Last Byte (TTLB).
  This can tell you how responsive the application is.
I'm going to show you only the report from now on, as we proceed to optimize
this application. Also I'll only show the important information. (No more
screenshots, that is)
Caching instances
When your web application is called (as an ISAPI, NSAPI or Apache module), the
application spawns a new thread for a request. Within the context of this
thread, your main web module is created (in this case, the TCustomerInfoModule
in main.pas). When the request is handled and the response is sent, the created
instance is then:
- Freed, if you set Application.CacheConnections to False. (It's True by
  default.)
- Cached in an internal array for reuse at the next request, if
  Application.CacheConnections is True.
Now what if you have one request being handled when another request comes in?
The application will look inside the cache - if a webmodule is available, it is
used. Otherwise a new thread is created, with a new instance of the webmodule
to handle the request. This happens until the number of currently active
connections reaches the value in Application.MaxConnections, after which an
exception is raised which says "Too many active connections". This shows up
as an Internal Server Error on the browser.
Now the default for Application.MaxConnections is 32, so if you have more than
32 connections active at any time, you will see internal server errors. Our
test uses 50 threads with 2 sockets each, which can mean up to 100 simultaneous
connections, so let's see the results if we increase this value to 100.
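The change itself is one line in the project (.dpr) file. Here is a minimal sketch, assuming a project file shaped like the stock ISAPI template in Delphi 6 or later; the unit list and the form-creation line in your copy of the demo may differ.

```delphi
library iserver;

uses
  WebBroker,
  ISAPIApp,
  Main in 'Main.pas' {CustomerInfoModule: TWebModule};

{$R *.RES}

exports
  GetExtensionVersion,
  HttpExtensionProc,
  TerminateExtension;

begin
  Application.Initialize;
  // The default is 32; requests beyond the limit fail with
  // "Too many active connections".
  Application.MaxConnections := 100;
  // CacheConnections is True by default, so web modules are reused.
  Application.CreateForm(TCustomerInfoModule, CustomerInfoModule);
  Application.Run;
end.
```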
Overview
-----------------------------------------------------------------------
Number of hits: 374
Requests per Second: 3.11
Socket Errors
-----------------------------------------------------------------------
Connect: 0
Send: 0
Recv: 0
Timeouts: 0
Result Codes
Code Description Count
=======================================================================
200 OK 216
500 Internal Server Error 158
Page Summary
Page Hits TTFB Avg TTLB Avg Auth
===================================================================
GET /scripts/iserver.dll 100 1825.33 1910.08 No
GET /scripts/iserver.dll/custo 97 2228.18 2229.77 No
GET /scripts/iserver.dll/runqu 88 2641.23 2664.10 No
GET /scripts/iserver.dll/custo 58 2801.14 2861.98 No
GET /scripts/iserver.dll/runqu 26 2587.65 2587.77 No
GET /scripts/iserver.dll/custo 4 980.50 980.50 No
GET /scripts/iserver.dll/runqu 1 3594.00 3594.00 No
GET /scripts/iserver.dll/runqu 0 0.00 0.00 No
GET /scripts/iserver.dll/runqu 0 0.00 0.00 No
GET /scripts/iserver.dll/custo 0 0.00 0.00 No
You'll notice that the error rate is lower, but we're still seeing errors. You
can use trial and error to arrive at a value you can live with. But let's try
some more optimization methods.
The ISAPI Thread Pool
ISAPI applications can use their own thread pooling mechanism to maintain
application threads. Delphi provides its own unit, ISAPIThreadPool, for this -
all you have to do is add it to your uses clause, just below the line that
contains "ISAPIApp" in your .dpr file. I've now reduced the test time to one
minute, mainly because the two-minute runs took too long to wait for.
Note for Delphi 6 users: Steve Trefethen has provided an updated version of this unit
for D6 users. The original unit does nothing spectacular.
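In the project file, this is one extra entry in the uses clause. The sketch below assumes the stock ISAPI project layout; only the ISAPIThreadPool line is new, and the rest of your .dpr may differ.

```delphi
library iserver;

uses
  WebBroker,
  ISAPIApp,
  ISAPIThreadPool,  // added directly below ISAPIApp
  Main in 'Main.pas' {CustomerInfoModule: TWebModule};

{$R *.RES}

exports
  GetExtensionVersion,
  HttpExtensionProc,
  TerminateExtension;

begin
  Application.Initialize;
  // Keep any MaxConnections setting you already use here.
  Application.CreateForm(TCustomerInfoModule, CustomerInfoModule);
  Application.Run;
end.
```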
Results:
Overview
=======================================================================
Number of hits: 751
Requests per Second: 12.50
Result Codes
Code Description Count
=======================================================================
200 OK 560
500 Internal Server Error 191
Page Summary
Page Hits TTFB Avg TTLB Avg Auth
=======================================================================
GET /scripts/iserver.dll 108 2406.58 2409.21 No
GET /scripts/iserver.dll/custo 102 3232.81 3233.85 No
GET /scripts/iserver.dll/runqu 102 5705.95 5708.00 No
GET /scripts/iserver.dll/custo 100 7390.03 7391.45 No
GET /scripts/iserver.dll/runqu 97 7846.37 7847.36 No
GET /scripts/iserver.dll/custo 84 6089.39 6091.00 No
GET /scripts/iserver.dll/runqu 70 3101.99 3102.36 No
GET /scripts/iserver.dll/runqu 44 1688.95 1689.32 No
GET /scripts/iserver.dll/runqu 26 680.85 681.23 No
GET /scripts/iserver.dll/custo 18 314.83 315.28 No
Notice that we could handle a lot more connections in one minute! (I got an
average of around 350 per minute, which is still higher than the 175 per minute
average I saw on the earlier application)
We're doing slightly better on internal server errors, and if you look closely
in the reports section, you'll notice the errors are all on the database pages.
This might be related to contention or locks, and the DBDEMOS database being in
Paradox format might cause other problems too.
Conversion to a better database engine
Let's move the DBDEMOS database to an InterBase database and retest. I've also
converted to using IBX components instead (since the BDE isn't very reliable in
multithreaded apps).
Overview
=======================================================================
Number of hits: 371
Requests per Second: 6.17
Socket Statistics
-----------------------------------------------------------------------
Socket Connects: 1145
Total Bytes Sent (in KB): 148.00
Bytes Sent Rate (in KB/s): 2.46
Total Bytes Recv (in KB): 854.50
Bytes Recv Rate (in KB/s): 14.22
Result Codes
Code Description Count
=======================================================================
200 OK 363
NA HTTP result code not given 8
Page Summary
Page Hits TTFB Avg TTLB Avg Auth
=======================================================================
GET /scripts/iserver.dll 89 3899.07 3901.65 No
GET /scripts/iserver.dll/custo 49 6327.51 6330.12 No
GET /scripts/iserver.dll/runqu 54 7037.46 7038.59 No
GET /scripts/iserver.dll/custo 40 8859.60 8863.73 No
GET /scripts/iserver.dll/runqu 28 11089.82 11090.32 No
GET /scripts/iserver.dll/custo 20 7692.70 7696.75 No
GET /scripts/iserver.dll/runqu 27 8359.89 8360.96 No
GET /scripts/iserver.dll/runqu 21 9874.52 9875.48 No
GET /scripts/iserver.dll/runqu 22 6795.91 6797.86 No
GET /scripts/iserver.dll/custo 21 6338.81 6341.81 No
This is much better - no Internal Server Errors! You can now try to see what's
causing all that delay. We're getting an average of around 4 - 11 seconds per
page, which is not all that great. Note that the development machine I use
isn't well configured for serving web pages: it has IDE hard drives, 512 MB of
133 MHz memory and a single 667 MHz PIII processor. Today's machines and memory
buses are much faster, so you'll see much better performance on them. Also, the
bandwidth is restricted to 56 K in testing, which is less than average.
Testing other parts of the framework
We've only tested two pages in the application - there are others that use blob
fields etc. Let's build a test plan for these pages and check.
Overview
=======================================================================
Number of hits: 504
Requests per Second: 8.38
Result Codes
Code Description Count
=======================================================================
200 OK 504
Page Summary
Page Hits TTFB Avg TTLB Avg Auth
=======================================================================
GET /scripts/iserver.dll 80 1189.72 1228.75 No
GET /scripts/iserver.dll/custo 80 6942.93 6944.74 No
GET /scripts/iserver.dll/runqu 80 11354.19 11356.20 No
GET /scripts/iserver.dll/emplo 80 10721.46 10724.09 No
GET /scripts/iserver.dll/bioli 80 8370.70 8373.29 No
GET /scripts/iserver.dll/getim 74 6645.24 6673.00 No
GET /scripts/iserver.dll/runqu 28 6723.64 6724.79 No
GET /scripts/iserver.dll/getim 2 6382.50 6407.50 No
This is only slightly better - I've used a different thread pooling unit from
http://www.delphi3000.com/articles/article_1693.asp that seems to give
a better performance than the Borland unit.
Using a database connection pool
Right now, we have a database connection on each webmodule, which isn't very
easy on memory usage. In this case you have 250 cached web modules, which means
250 connections to the database. If you don't want this, you can create a
database connection pool, say of around 50 connections. This will impact
performance but the service will be more reliable since it won't take so much
memory and resources.
Note that if you use ADO, you don't have to do this - there is built-in
connection pooling in ADO.
I've created a resource pool datamodule as a non-web datamodule - this will be
a singleton instance. Before you open any query, you must get a connection from
the pool. The pool will grow to a max. size of 50, and maintains a list of
active and inactive connections. Each connection once created is never freed -
it only adds to the pool, and once inactive will be assigned to the next
request. If the pool is full when a request comes in, the system waits for a
preassigned amount of time (say 10 seconds, which is the highest time that it
takes to serve a page as per results above) and if there's still no connection
available, it flags an error that the Server is too busy.
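The pool behaviour described above can be sketched as follows. This is not the datamodule from the article's download: it is a plain class using a Win32 semaphore, and the names TConnectionPool, EPoolBusy and CreateConnection are hypothetical. A descendant would override CreateConnection to return, for example, a connected IBX database component.

```delphi
unit ConnPoolSketch;

interface

uses
  Windows, SysUtils, SyncObjs, Contnrs;

type
  EPoolBusy = class(Exception);

  // Hypothetical pool: connections are created on demand up to MaxSize,
  // never freed while the pool lives, and reused once released.
  TConnectionPool = class
  private
    FLock: TCriticalSection;  // guards FIdle
    FIdle: TObjectList;       // connections waiting for reuse
    FSlots: THandle;          // semaphore: how many more may be handed out
    FTimeoutMs: DWORD;
  protected
    function CreateConnection: TObject; virtual; abstract;
  public
    constructor Create(MaxSize: Integer; TimeoutMs: DWORD);
    destructor Destroy; override;
    function Acquire: TObject;
    procedure Release(Conn: TObject);
  end;

implementation

constructor TConnectionPool.Create(MaxSize: Integer; TimeoutMs: DWORD);
begin
  inherited Create;
  FLock := TCriticalSection.Create;
  FIdle := TObjectList.Create(True);  // frees idle connections on destroy
  FSlots := CreateSemaphore(nil, MaxSize, MaxSize, nil);
  FTimeoutMs := TimeoutMs;
end;

destructor TConnectionPool.Destroy;
begin
  CloseHandle(FSlots);
  FIdle.Free;
  FLock.Free;
  inherited;
end;

function TConnectionPool.Acquire: TObject;
begin
  // Wait for a free slot; give up once the timeout expires.
  if WaitForSingleObject(FSlots, FTimeoutMs) <> WAIT_OBJECT_0 then
    raise EPoolBusy.Create('Server is too busy');
  Result := nil;
  FLock.Enter;
  try
    if FIdle.Count > 0 then
      Result := FIdle.Extract(FIdle.Last);  // reuse without freeing
  finally
    FLock.Leave;
  end;
  if Result = nil then
    try
      Result := CreateConnection;  // pool grows lazily up to MaxSize
    except
      ReleaseSemaphore(FSlots, 1, nil);  // hand the slot back on failure
      raise;
    end;
end;

procedure TConnectionPool.Release(Conn: TObject);
begin
  FLock.Enter;
  try
    FIdle.Add(Conn);
  finally
    FLock.Leave;
  end;
  ReleaseSemaphore(FSlots, 1, nil);
end;

end.
```

With the figures from the text, you would create a single descendant instance with MaxSize 50 and TimeoutMs 10000, and wrap each request's database work in Acquire followed by try..finally Release.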
Overview
================================================================================
Report name: 9/6/2003 5:16:34 PM
Run on: 9/6/2003 5:16:34 PM
Run length: 00:01:17
Web Application Stress Tool Version:1.1.293.1
Number of test clients: 1
Number of hits: 770
Requests per Second: 10.25
Result Codes
Code Description Count
================================================================================
200 OK 770
Page Summary
Page Hits TTFB Avg TTLB Avg Auth Query
================================================================================
GET /scripts/iserver.dll 158 2548.14 2555.13 No No
GET /scripts/iserver.dll/custo 129 6387.86 6399.35 No No
GET /scripts/iserver.dll/runqu 83 11023.45 11024.47 No No
GET /scripts/iserver.dll/emplo 81 9416.53 9421.02 No No
GET /scripts/iserver.dll/bioli 80 4574.57 4576.85 No No
GET /scripts/iserver.dll/getim 80 3991.64 4011.29 No No
GET /scripts/iserver.dll/runqu 80 9689.76 9692.16 No No
GET /scripts/iserver.dll/getim 79 6849.80 6864.00 No No
This test has been run for 15 more seconds but you can see the performance
improvement in terms of requests per second, timing and number of hits handled.
Techniques for optimization within the VCL
- String searching
  The PathInfo received is whatever follows the URL (for instance, in
  http://test.com/search, "/search" is the PathInfo). Delphi ISAPI Web Modules
  have a TCollection that stores a list of valid paths that have handlers.
  If your web application has a large number of PathInfo handlers, Delphi will
  search linearly through the list of TCollection items to find the right
  handler. You can optimize this by writing an inherited WebModule that keeps
  the TCollection sorted and uses something like a binary search. This can give
  you a big improvement.
- Tag handling in responses
  The TPageProducer and its descendants provide a way to handle custom tags
  such as <#TAGNAME>. You can see how this works in the HttpProd unit, in the
  ContentFromStream handler. This uses a token based approach - so it actually
  parses the entire HTML script! You can definitely optimize this by changing
  to a different mechanism for string searching. There are a number of fast
  string search and replace utilities which will make your application that
  much faster. Plus, you can write content using your own handlers instead of
  using TPageProducer or TDatasetPageProducer. While this may not be visual, it
  definitely can improve performance.
Here are a few string search routines:
- FastStrings: A free string routine library with source. Postcardware.
- Hyperstring: A commercial string utilities library
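The sorted-lookup idea from the string searching item can be sketched with a sorted TStringList, whose Find method performs a binary search. This is only an illustration of the lookup, not a replacement dispatcher: the paths are hypothetical, and it assumes exact path names with no wildcards.

```delphi
program PathLookupSketch;

{$APPTYPE CONSOLE}

uses
  SysUtils, Classes;

var
  Paths: TStringList;
  Index: Integer;
begin
  Paths := TStringList.Create;
  try
    // Set Sorted before adding, so Add keeps the list ordered.
    Paths.Sorted := True;
    Paths.Duplicates := dupError;
    Paths.Add('/search');   // hypothetical handler paths
    Paths.Add('/orders');
    Paths.Add('/login');

    // Find does a binary search over the sorted list.
    if Paths.Find('/orders', Index) then
      Writeln('handler found at index ', Index)
    else
      Writeln('no handler for /orders');

    if not Paths.Find('/missing', Index) then
      Writeln('no handler for /missing');
  finally
    Paths.Free;
  end;
end.
```

In a real web module you could store each handler in the Objects array alongside its path and call it once Find succeeds.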
Changes in Internet Services Manager
You can change the properties of a web site to handle a larger number of
clients. Bring up Internet Services Manager from Administrative Tools, and
you'll see what you can change. If you open the Web Site properties, you can
change the following:
- Performance parameters: in the Performance tab, you can tune the site for the
  number of visits you expect.
- Logging: you will probably need to maintain logs, but logging takes extra
  time. Keep your log information as concise as possible, and make sure log
  files don't grow too big on the server.
- Application Protection: in the virtual directory where you store your ISAPI
  DLL, you can set Application Protection to Low (IIS Process), Medium
  (Pooled), or High (Isolated). This determines the context the DLL gets loaded
  in: Low loads your DLL into the IIS process space, and High loads it into a
  separate (COM+) application. Obviously Low will have the best performance,
  and High the worst. But the trade-off is reliability: a DLL that fails inside
  the IIS process can crash the web server.
- Application Configuration: in the "App Options" dialog box, uncheck "Enable
  Session State". This setting is only for ASP applications, so you should
  disable it for your ISAPI applications.
Data Compression
Read David Intersimone's excellent article about compressing data before
sending. Remember that compression is a CPU intensive task that puts extra load
on the server, since it now has to compress each response before sending it, so
use compression only for pages that transfer large amounts of data. You also
have to analyze where your bottleneck is: compression may not solve your
problem if your bottleneck is CPU usage, but it might make a difference if the
issue really is a limited amount of bandwidth.
Buffering Output
Usually a user notices better performance if the web browser receives content
faster - so instead of buffering all output before sending it back, you can
send back at least the HTML tags that constitute the header and title, so that
the user gets to see activity immediately. You can flush the response by using
Response.SendStream.
Database schema optimization
For web applications that connect to databases and run queries, you will see a
performance gain by optimizing the database itself. Adding indexes, views and
using plans will result in shorter query time, which will decrease the total
time it takes to process a web request.
How do you do this? Some pointers:
- Ensure all tables have primary keys. Single-field primary keys are best in my
  opinion, since they save you maintenance trouble, but whatever you choose,
  make sure the table has a primary key defined. This improves lookup
  performance.
- When you have a link between two tables (usually a field in one table links
  to the other table's primary key), use a foreign key, and perhaps an index on
  the foreign key, depending on how often the link is used in your application
  SQL.
- I assume you're using an SQL database, and most SQL databases support SQL
  "plans". A plan tells you how your SQL is parsed by the database and which
  indexes, keys etc. the database uses to merge data. Ideally a plan should use
  as many indexes as possible and avoid having to merge intermediate results.
  You can tweak the SQL accordingly, and in some database engines you can even
  specify the plan (for instance, if you only expect to show a few rows, you
  can use a plan that returns only the top few rows). I can't give you examples
  here - it's out of the scope of this paper.
- Use views to return complex SQL based data. This method not only reduces SQL
  parsing time, it also ensures a pre-optimized data view for you. Most
  database engines will pre-compile and pre-optimize a view so repeated view
  access is faster.
Further optimizations
There are more optimizations you can do which will make your applications
faster.
- Hardware: Better hardware is more scalable - a faster bus, faster hard disk,
  larger caches, multiple processors etc.
- Server farms: You can use multiple server machines when a larger number of
  requests come in. For this you might need a load balancing server, which is
  available from multiple vendors. Remember that you shouldn't store state in
  local variables or files - all state should go into databases, since the next
  request may be handled by another machine.
- Data farms: What if your bottleneck is the database engine itself? Then you
  can have database "farms" - multiple data servers that host the same database
  engine and the same data. You can synchronize your data across these servers
  by using what's known as "merge replication". Many database servers,
  including Oracle and SQL Server, support this feature, and you can even set
  these databases up for scheduled merges every few hours.
- Fast Internet connections: IIS depends quite heavily on the speed of your
  connection, and increasing speed or optimizing your network connection can
  make a difference to your application's response time.
- TCP/IP and IIS optimization: Some very useful tips are available from this
  site.
Optimization techniques for ASP.NET
If you're using ASP.NET with C#Builder or Delphi for .NET, there are some
changes you can make to ensure your application is scalable. Some links are
important to consider. I'll talk about some of these techniques below.
1. Disable session state
If you need to identify which user is currently requesting your page, you'd
want to use the ASP.NET session state technology. But many pages you write will
not need this functionality, so session state is of no importance to these
pages. There is a downside to having session state, and that is server side
performance: the server will try to "identify" the current user when session
state is enabled. Unfortunately session state is enabled by default, but you
can disable it for a page using the @Page directive in your ASP.NET page.
<%@ Page EnableSessionState="false" %>
What if you have to use session state? In ASP.NET you can choose to have
session state stored using an in-process server or an out-of-process server,
and even in a database. Obviously performance suffers as you go out of process,
and drops further when you choose the database approach. But these latter
options have upsides too (reliability and redundancy), so choose your option
carefully.
2. Use Page.IsPostBack for round trips
Your page may have a ton of controls, but you may need to populate these
controls from a database only the first time. On every control event, the page
data is sent back to the server, where the event handler runs. But you don't
need to populate the controls each time! (Perhaps you'll just redirect to
another page.) You can figure out if this is a "postback" call, i.e. if data on
the page has been posted back to the server, by checking the Page.IsPostBack
property, which is true if this is a postback call. That could save you a ton
of time when processing postbacks.
3. Optimize the use of "ViewState".
If you look at most ASP.NET pages, you'll see something like this :
<input type="hidden" name="__VIEWSTATE" value="dDwzNjA1NzEwMDg
7dDw7bDxpPDA+O2k8MT47PjtsPHQ8O2w8aTwxPjtpPDI+Oz47b
Dx0PDtsPGk8MD47PjtsPHQ8QDxHb3REb3ROZXQ6IFRoZSBNaWN
yb3NvZnQgLk5FVCBGcmFtZXdvcmsgQ29tbXVuaXR5O1xlO1xlO
z47Oz47Pj47dDxAPFxlO1xlOz47Oz47Pj47dDw7bDxpPDA+O2k
8ND47aTw2Pjs+O2w8dDw7bDxpPDE+O2k8Mz47aTw3Pjs+O2w8d
Dw7bDxpPDA+Oz47bDx0PDtsPGk8Mz47PjtsPHQ8O2w8aTwxPjt
pPDM+O2k8NT47aTw2Pjs+O2w8dDxw
...
If your page doesn't have controls then you don't need this ViewState variable
at all! Pages that are purely informative or reports are just wasting space on
this ViewState variable. And this is especially true for DataGrids, which keep
a lot in the ViewState. You can disable ViewState on a per-control basis by
using the "EnableViewState" attribute.
<asp:datagrid EnableViewState="false" ... runat="server"/>
Or, you can use this on the entire page:
<%@ Page EnableViewState="false" %>
4. Pre-compile your web page.
The first time a web application is run, or if a file has changed, the ASP.NET
runtime recompiles the ASP.NET files. This can take a while, so after you
upload changes to your site, load your browser up requesting a page from each
directory on your server. This ensures all the pages are "pre-compiled" so that
the next user that requests a page doesn't have to wait for compiles.
5. Use .NET managed components instead of COM components
If you're used to ASP, then you might be using a number of COM components in
your ASP code. Porting this to .NET might make you believe that you can reuse
those COM components through COM Interop. But this has a significant effect on
performance. Most commonly used COM components have .NET managed equivalents,
and you can use them instead. For instance, instead of using
Scripting.FileSystemObject, you can convert to using the classes in the
System.IO namespace. If you have written Delphi based COM components, consider
converting them to managed code using Delphi for .NET for improved performance.
6. Make optimum use of caching.
In ASP.NET there are multiple caching methods available:
- You can cache an entire page.
- You can cache the output of a control (if it's not something that changes
  often).
- You can cache application data or user data.
Caching significantly improves performance, especially when data is static and
you don't expect it to change. For instance, lookup data in tables might not
change at all - in such cases simply cache this on the server so you don't have
to make a costly connection to the database.
See a huge set of links on caching in .NET.
7. Tune your "web.config" files.
The web.config files for your web site contain many juicy bits of information
that you can use to optimize your site. You can disable NTLM authentication
(improves performance), change isolation patterns for your application etc.
Read the MSDN documentation for what options are available and what might apply
to you.
Conclusion
You can write ISAPI applications in Delphi, and there is scope for a lot of
optimization. I hope this paper has helped you identify potential scalability
problems in your application, and given you enough information to test for such
problems. There are of course innumerable things you can do specific to your
application - you might get better performance using different SQL in your
queries, you might see a vast improvement by using a specific version of thread
pooling etc. In general, web applications can be optimized to a large extent by
changing the way you program web applications, and you must build such
applications with one eye on scalability.