OpenSolaris

Bug ID 6540060
Synopsis race in pkcs#11 engine in multithreaded environment
State 10-Fix Delivered (Fix available in build)
Category:Subcategory solaris-crypto:openssl
Keywords bad | engine | handshake | kt-scalability | mac | openssl | pkcs#11
Responsible Engineer Jan Pechanec
Reported Against snv_59
Duplicate Of
Introduced In solaris_10
Commit to Fix snv_66
Fixed In snv_66
Release Fixed solaris_nevada(snv_66) , solaris_10u6(s10u6_05) (Bug ID:2149617)
Related Bugs 6375348 , 6554248 , 6558630 , 2151824 , 6593176 , 6605538 , 6672815 , 6715982
Submit Date 28-March-2007
Last Update Date 24-June-2008
Description
With Apache 2.2.4 built in Worker mode (MPM-Worker), enabling the pkcs11 engine
causes Bad Record MAC errors on clients:

SSL_connect error: error:140943FC:SSL routines:SSL3_READ_BYTES:sslv3 alert bad record mac

Clients also reported the following:
SSL_ERROR_SSL: status = -1, error = 1, error:1409E0E5:SSL routines:SSL3_WRITE_BYTES:ssl handshake failure

This was observed under a web SSL workload.
The problem can be reproduced at a very low load of 100 load generators.

This happens only with the Apache Worker model. When I configure Apache to
have only one thread per process, there is no problem. As soon as there are 2 or
more threads per httpd process, the errors show up. It is also not an issue with
a single client thread (i.e. when all the requests are serialized).

The problem does not occur in the following cases:

- Apache worker model with the pkcs11 engine enabled and each Apache process
  handling a single connection
- Apache prefork model, which runs perfectly with the pkcs#11 engine

I have tried a few Apache versions (all built with the Worker model). All have the
same issue. The OpenSSL version used to build Apache was the one in S10, i.e.
libssl.so.0.9.7.
It's quite probable that this is not specific to ncp(7). However, I am simulating it on Niagara for now anyway.

For the record, the Apache worker configuration is like this:

<IfModule worker.c>
ListenBacklog        512
ServerLimit          64
#ThreadLimit         64
MaxClients           480
StartServers         32
ThreadsPerChild      12
#MinSpareThreads      16384
#MaxSpareThreads      16384
MaxRequestsPerChild  0
</IfModule>
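
For comparison, the one-thread-per-process case mentioned earlier, which did not show the errors, could be configured roughly as follows. This is only a sketch: the values actually used in that test are not recorded in this report, so every number below is an assumption, chosen so that MaxClients does not exceed ServerLimit * ThreadsPerChild.

```apache
# Hypothetical single-threaded variant of the worker configuration above.
# All values are assumptions, not the settings used in the original test.
<IfModule worker.c>
ListenBacklog        512
ServerLimit          64
# One thread per httpd process, so the engine is never shared between threads
ThreadsPerChild      1
# Must not exceed ServerLimit * ThreadsPerChild (64 * 1)
MaxClients           64
StartServers         32
MaxRequestsPerChild  0
</IfModule>
```

With ThreadsPerChild set to 1, concurrency comes only from separate httpd processes, which is closer to the prefork behaviour that works with the pkcs#11 engine.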

Apache 2.2.4 was built like this:

./configure --prefix=/export/apps --enable-ssl --with-ssl=/usr/sfw --with-mpm=worker

As a client, I use http_load, driven by a small 'run' script that looks like this:

[ $# -ne 5 ] && \
    echo "$0 <urlfile> <n_parallels> <num_of_cycles> <n_of_fetches> <cipher>" && exit

# for SSL
for i in `yes | head -$3`; do
        echo "starting"
        ./http_load -cipher $5 -parallel $2 -fetch $4 $1 &
done

It is started like this:

./run url-ssl 10 10 100 fastsec

with the url-ssl file containing this:

https://ogma.czech:443

When starting the 'run' script, I almost immediately see error messages about failed SSL handshakes:

9167:error:140943FC:SSL routines:SSL3_READ_BYTES:sslv3 alert bad record mac:../../../../common/openssl/ssl/s3_pkt.c:1057:SSL alert number 20
./http_load: ran out of connection slots
1 fetches, 9 max parallel, 0 bytes, in 1.31269 seconds
0 mean bytes/connection
0.761794 fetches/sec, 0 bytes/sec
HTTP response codes:
./http_load: SSL connection failed - 0
9176:error:140943FC:SSL routines:SSL3_READ_BYTES:sslv3 alert bad record mac:../../../../common/openssl/ssl/s3_pkt.c:1057:SSL alert number 20
https://ogma.czech:443: byte count wrong
./http_load: ran out of connection slots
Changing synopsis to clearly specify the problem for better future references.
Work Around
Workaround removed since it was for 6554248.
The removed workaround applied only to the DH problem, not to multithreaded environments. I don't know of any workaround for the multithreaded case at this time.
Comments
N/A