Benchmark Results

The results of the nightly benchmark runs:

  • task all
  • task list
  • task next
  • task add


Potential Improvements

Performance has not yet been addressed in the 2.0 product, but a list of potential areas for improvement is being gathered:

  • DOM lookup caching
  • E9 can reserve N/2 slots in the value stack
  • If a filter sequence contains only IDs and no UUIDs, then only pending.data need be read, unless inversion operators are involved
  • TDB2 can reserve space according to the number of lines in a file
  • Detect invariants in a filter, and short-circuit
  • Identify candidates for threading, which include: data load/parse, ViewTask rendering, E9::eval, TDB2::gc.
  • Emission of color control codes needs to be optimized. If two adjacent cells in the output are the same color, the output looks something like:
    <red> DATA </red><red> DATA </red>
    Such runs should be detected and reduced to:
    <red> DATA  DATA </red>
    This reduces the volume of output generated, and therefore the volume the terminal program must interpret.
  • Optimization of code hot spots such as: E9::eval, Task::parse, extractLines.
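
The color control code item above can be illustrated with a minimal sketch. It assumes a hypothetical Cell type and renderRow function, neither of which is taken from the Taskwarrior source, and it uses the same <red> placeholder notation as the list rather than real terminal escape sequences. A color is opened only when it changes, and closed only when the run of same-colored cells ends.

```cpp
#include <string>
#include <vector>

// Hypothetical cell type: a color name (empty means uncolored) and its text.
struct Cell {
  std::string color;
  std::string text;
};

// Render a row of cells. A color tag is opened only when the color changes,
// and closed only when the run of same-colored cells ends.
std::string renderRow(const std::vector<Cell>& cells) {
  std::string out;
  std::string current;  // color of the run that is currently open, if any
  for (const auto& cell : cells) {
    if (cell.color != current) {
      if (!current.empty()) out += "</" + current + ">";
      if (!cell.color.empty()) out += "<" + cell.color + ">";
      current = cell.color;
    }
    out += cell.text;
  }
  if (!current.empty()) out += "</" + current + ">";
  return out;
}
```

Given two adjacent red cells that each contain " DATA ", this emits a single opening and a single closing code around both cells instead of one pair per cell.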


Arg:: enum Conversions 2011-08-21

Changing Arg::_category and Arg::_type from std::string to enum produced a significantly larger gain in E9::eval performance than expected. Using the standard test data and queries, average filter times dropped as follows:

              task all    8514 → 5829 = 31%
              task list   6711 → 4644 = 30%
              task next   6587 → 4607 = 30%
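
The kind of change described above can be sketched as follows. The enumerator names, the simplified Arg layout and the countOperators function are assumptions made for illustration and are not taken from the Taskwarrior source. One plausible reason for the gain is that testing an enum member is a single integer comparison, whereas testing a std::string member compares characters.

```cpp
#include <string>
#include <vector>

// Hypothetical enumerators, for illustration only.
enum class Category { Literal, Operator, Attribute };

// Simplified stand-in for Arg. The category was previously a std::string.
struct Arg {
  std::string raw;    // the token text
  Category category;  // now an enum
};

// Count operator tokens. Each test is one integer comparison, where a
// std::string member would need a character-by-character comparison.
int countOperators(const std::vector<Arg>& args) {
  int count = 0;
  for (const auto& arg : args) {
    if (arg.category == Category::Operator) ++count;
  }
  return count;
}
```

An evaluator that inspects the category of every argument for every task performs this test many times per report, which is where a cheaper comparison would add up.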


New Baseline 2011-08-19

Taskwarrior 2.0 has several new components, which impact performance:

  • The TDB database accessor is replaced by TDB2, which provides a layered implementation (file <--> string <--> lines <--> tasks) that allows on-demand parsing of lines into tasks, rather than always parsing every line. TDB2 also implements the concept of the backlog, which is the collection of offline changes that are not yet synced to the server. TDB2 is expected to be faster, except when server communication is involved.
  • The filtering mechanism is replaced by a more complex, but significantly more capable E9 expression evaluator. This allows much greater flexibility in terms of filter complexity, by introducing operators, precedence, AND, OR, XOR and date math. The filtering is expected to be slower.
  • Sorting now uses std::sort instead of an implementation of CombSort11, and is expected to be faster. This is countered by the on-demand parsing, which may now occur during sorting, skewing results.
  • Rendering has been upgraded from the old Table code, which had several very inefficient steps: (1) it made copies of all data, and (2) the data copies were involved in the sorting, which in the case of dates required re-parsing the values. The new rendering does not copy data, but uses references into the TDB2 data. Sorting is decoupled from rendering. This new ViewTask is expected to be faster.
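
The on-demand parsing mentioned for TDB2 can be sketched roughly as below. This is a minimal illustration under stated assumptions, not the TDB2 implementation: the LazyTasks and ParsedTask names are hypothetical, and the "parse" step is a placeholder that copies the line. Raw lines are stored as-is, and a line is parsed only the first time its task is requested.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a parsed task.
struct ParsedTask {
  std::string description;
};

// Holds raw lines and parses each one only when it is first requested.
class LazyTasks {
 public:
  explicit LazyTasks(std::vector<std::string> lines)
      : lines_(std::move(lines)),
        tasks_(lines_.size()),
        parsed_(lines_.size(), false) {}

  // Parse line i on first access; later calls reuse the stored result.
  const ParsedTask& get(std::size_t i) {
    if (!parsed_[i]) {
      tasks_[i].description = lines_[i];  // placeholder for real parsing
      parsed_[i] = true;
      ++parseCount_;
    }
    return tasks_[i];
  }

  std::size_t size() const { return lines_.size(); }
  std::size_t parseCount() const { return parseCount_; }

 private:
  std::vector<std::string> lines_;  // raw file lines
  std::vector<ParsedTask> tasks_;   // filled in lazily
  std::vector<bool> parsed_;        // which entries of tasks_ are valid
  std::size_t parseCount_ = 0;      // number of parses actually performed
};
```

A consequence visible in this sketch is that the cost of parsing moves to whichever step first touches a task. If that step is a sort comparison, the parse time is attributed to sorting, which is the skew described in the list above.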

The combination of the above changes means that it is no longer valid to compare 2.0 to earlier versions. Earlier data has been dropped, and 2.0 will be the performance baseline.


Standardized Testing Data

A standard data set has been defined for performance testing. 'Hamlet', Act I, has been loaded as a set of tasks. Each line spoken is a task description, all speakers are added as tags, any royal titles are projects, anything Hamlet says is logged instead of added, and all lines spoken by Ghost are annotations to the prior line.

The result is a nice mix of pending and completed tasks, with short and long descriptions. There are 194 tasks, which exercise the data load and rendering.

A nightly run of several reports against the standard testing data is being performed, and data is being accumulated. The performance results of this will be made available online.



Copyright © 2012, Göteborg Bit Factory.