[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v4 2/4] Map: backport find_opt/update from 4.06


  • To: Christian Lindig <christian.lindig@xxxxxxxxxx>, "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>
  • From: Edwin Torok <edvin.torok@xxxxxxxxxx>
  • Date: Fri, 28 Aug 2020 17:43:29 +0000
  • Accept-language: en-GB, en-US
  • Authentication-results: esa3.hc3370-68.iphmx.com; dkim=none (message not signed) header.i=none
  • Cc: Ian Jackson <Ian.Jackson@xxxxxxxxxx>, "dave@xxxxxxxxxx" <dave@xxxxxxxxxx>, "wl@xxxxxxx" <wl@xxxxxxx>
  • Delivery-date: Fri, 28 Aug 2020 17:43:44 +0000
  • Ironport-sdr: kZuKNfLftCuyrsl9aZ+9HOit7l7n2ZePiS+eEw/lU7u+u/Bj50WaVW1OogXCRe86hnK1l8m5Ga bUWDOF/X5TDQLp/GcMnwbGdbKHzFDSm7sVHTyO8z2uwi8yGO7hlqu5mYOWnLRJiN5NxPKNUoyH 44G2sixtr7r9BCgZ6lRHE+vQbf+PrdhhyoptWdoLJE0hDyQY1gqnNWMQnCrcNWL6/5Pgc1PMKV w1pOfq331USMrlJg8FAQGzeYlet1uYRzy0SW0PhKn/+NJ24J2aXR1t9ltHqWgjVkaw4s59NxN4 2ug=
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>
  • Thread-index: AQHWfJh3gJ2m+Ivd3EG0JYd/oTowFqlNEDCAgACakQA=
  • Thread-topic: [PATCH v4 2/4] Map: backport find_opt/update from 4.06

On Fri, 2020-08-28 at 10:30 +0200, Christian Lindig wrote:
> +let find_opt k t =
> 
> +       (* avoid raising exceptions, they can be expensive *)
> 
> +       if mem k t then Some (find k t) else None
> 
> 
> 
> I disagree with this argument. Exceptions in OCaml are cheap because
> they don't walk the stack and cut to the exception handler directly.
> Is there a reason why they could be expensive here? In any case, the
> code is correct.
> 

Interesting question, it is best to measure.

The answer depends on whether the key is found or not in the map.
I'll change the impl to the one with find+catch, I think we  might be
looking up keys that are present more often than those that are missing
(although the 4.06 series will make this moot, 4.06 version is faster
than both approaches, regardless whether key is present or not).

To sum up the measurements below:
If the key is not found then my approach is faster (takes only 80% of
time), if the key is found then find+catch is faster (again an approx
80% of the time taken).

I based my comment on the documentation of raise_notrace, which says:
"A faster version raise which does not record the backtrace.",
which implies that recording the backtrace has a measurable perf
impact.
One could argue that if performance matters backtraces should be turned
off in production, but I think the value of having backtraces when some
hard-to-reprodue bug occurs outweights any perf penalty.
We should try to use exceptions only for unexpected situations though,
not finding a value in a map doesn't qualify.

See the attachment for a small micro-benchmark:
$ dune exec --profile=release -- ./updatet.exe raise
Estimated testing time 20s (2 benchmarks x 10s). Change using '-quota'.
┌───────────────┬──────────┬────────────┐
│ Name          │ Time/Run │ Percentage │
├───────────────┼──────────┼────────────┤
│ raise         │  33.52ns │    100.00% │
│ raise_notrace │  19.16ns │     57.16% │
└───────────────┴──────────┴────────────┘

So raising with a backtrace is measurably slower, taking the backtrace
spends some CPU cycles.

$ dune exec --profile=release -- ./updatet.exe find-opt
Estimated testing time 1m30s (9 benchmarks x 10s). Change using '-
quota'.
┌──────────────────────────┬──────────┬────────────┐
│ Name                     │ Time/Run │ Percentage │
├──────────────────────────┼──────────┼────────────┤
│ find_opt 4.06:10         │  49.10ns │     24.06% │
│ find_opt 4.06:100        │ 115.38ns │     56.55% │
│ find_opt 4.06:1000       │ 161.27ns │     79.03% │
│ find_opt=mem+find:10     │  50.48ns │     24.74% │
│ find_opt=mem+find:100    │ 110.39ns │     54.10% │
│ find_opt=mem+find:1000   │ 162.48ns │     79.63% │
│ find_opt=find+catch:10   │  89.10ns │     43.67% │
│ find_opt=find+catch:100  │ 160.80ns │     78.80% │
│ find_opt=find+catch:1000 │ 204.04ns │    100.00% │
└──────────────────────────┴──────────┴────────────┘


4.06 and mem+find take 80% of the time of find+catch.

But of course if the key is actually found in the map then we have
this: 
edwin@edvin-tower:~/uex % dune exec --profile=release -- ./updatet.exe
find-opt-found
Estimated testing time 1m30s (9 benchmarks x 10s). Change using '-
quota'.
┌──────────────────────────┬──────────┬─────────┬────────────┐
│ Name                     │ Time/Run │ mWd/Run │ Percentage │
├──────────────────────────┼──────────┼─────────┼────────────┤
│ find_opt 4.06:10         │  38.38ns │   2.00w │     52.65% │
│ find_opt 4.06:100        │  20.66ns │   2.00w │     28.35% │
│ find_opt 4.06:1000       │  20.63ns │   2.00w │     28.30% │
│ find_opt=mem+find:10     │  72.89ns │   2.00w │    100.00% │
│ find_opt=mem+find:100    │  39.06ns │   2.00w │     53.59% │
│ find_opt=mem+find:1000   │  39.07ns │   2.00w │     53.60% │
│ find_opt=find+catch:10   │  49.54ns │   2.00w │     67.97% │
│ find_opt=find+catch:100  │  33.01ns │   2.00w │     45.29% │
│ find_opt=find+catch:1000 │  32.97ns │   2.00w │     45.23% │
└──────────────────────────┴──────────┴─────────┴────────────┘

In this case find+catch is faster.

And here is update for a key that is not present:
$ dune exec --profile=release -- ./updatet.exe update
Estimated testing time 1m30s (9 benchmarks x 10s). Change using '-
quota'.
┌───────────────────────────────────┬──────────┬─────────┬────────────┐
│ Name                              │ Time/Run │ mWd/Run │ Percentage │
├───────────────────────────────────┼──────────┼─────────┼────────────┤
│ update 4.06:10                    │  79.96ns │  24.00w │     17.27% │
│ update 4.06:100                   │ 171.96ns │  48.00w │     37.15% │
│ update 4.06:1000                  │ 243.95ns │  66.00w │     52.70% │
│ update=find+catch+add/remove:10   │ 183.46ns │  24.00w │     39.63% │
│ update=find+catch+add/remove:100  │ 340.00ns │  48.00w │     73.45% │
│ update=find+catch+add/remove:1000 │ 462.89ns │  66.00w │    100.00% │
│ update=mem+find+add/remove:10     │ 126.06ns │  24.00w │     27.23% │
│ update=mem+find+add/remove:100    │ 274.79ns │  48.00w │     59.36% │
│ update=mem+find+add/remove:1000   │ 401.62ns │  66.00w │     86.76% │
└───────────────────────────────────┴──────────┴─────────┴────────────┘

Here 4.06 is a clear win, and mem+add is faster than find+catch+add.
Estimated testing time 1m30s (9 benchmarks x 10s). Change using '-
quota'.
┌───────────────────────────────────┬──────────┬─────────┬────────────┐
│ Name                              │ Time/Run │ mWd/Run │ Percentage │
├───────────────────────────────────┼──────────┼─────────┼────────────┤
│ update 4.06:10                    │  72.76ns │  24.00w │     31.25% │
│ update 4.06:100                   │ 164.69ns │  48.00w │     70.74% │
│ update 4.06:1000                  │ 232.79ns │  66.00w │    100.00% │
│ update=find+catch+add/remove:10   │ 133.24ns │  23.00w │     57.24% │
│ update=find+catch+add/remove:100  │ 118.76ns │  35.00w │     51.02% │
│ update=find+catch+add/remove:1000 │ 161.22ns │  59.00w │     69.26% │
│ update=mem+find+add/remove:10     │ 156.29ns │  23.00w │     67.14% │
│ update=mem+find+add/remove:100    │ 122.98ns │  35.00w │     52.83% │
│ update=mem+find+add/remove:1000   │ 161.53ns │  59.00w │     69.39% │
└───────────────────────────────────┴──────────┴─────────┴────────────┘

Interestingly here the 4.06 implementation is actually slower and not
much difference between my other two.

Best regards,
--Edwin

Attachment: dune
Description: dune

Attachment: dune-project
Description: dune-project

Attachment: updatet.ml
Description: updatet.ml


 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.