Karim:
On the Agoras Live side, Andrei is preparing for the release. He did some major refactoring of the backend with an eye on scalability and on making the Agoras Live backend more cloud-friendly, and in doing so he managed to improve the performance of the backend by about 20%. He’s also working on creating the test plan for the Agoras Live application as we prepare to hire some testers to test the entire application. As a minor improvement, he removed the need to log in very early on the Agoras Live website, allowing people to browse the application freely before they commit to creating an account.
On the TML development side, Lucca continued working on his 2-CNF extraction project. He managed to implement 2-CNF extraction from BDDs, which in itself is no small feat. Right now, he’s continuing the integration of the 2-CNF within the rest of the BDD library, with an eye on making it blend transparently between BDDs and 2-CNF, which should avoid the exponential explosions BDDs can suffer. In doing so, he also refactored some of the storage implementation, going from STL maps to STL vectors. He ran into a small problem with garbage collection, but we helped him with that. He’s made some great progress on the 2-CNF extraction.
Murisi, as usual, has outdone himself again this month. He found some performance bottlenecks in the TML core library and managed to improve performance by an amazing 40 times for some programs, and by about 37% for many other programs, by removing some memory allocation bottlenecks. He also finished working on the variable shifter transformation, which helps with memory savings within BDDs as well. In doing so, he had to make sure that he didn’t break canonicity, so that was a tricky thing. Well done again, Murisi, on those performance improvements.
Tomas continued his work on the state block transformation, which is really a usability feature for TML: it makes TML a lot easier to use and allows programmers to control the flow of the program a little better. He also made sure that it interacts well with other transformations such as nested programs. Tomas is also responsible for the IDE, which is currently the best way to access TML. For those of you who are familiar with the old TML playground: from now on, we’re going to be using the TML IDE instead. I strongly encourage you to play with it if you haven’t already.
Juan continues his work on the arithmetic built-ins, such as the count operator and the basic arithmetic operators. He also helped test the conversion to the Bit Universe transformation that Omar has worked on for a long time.
Professor Christoph Benzmüller:
Again, three lines of activities. The first one is the academic panels, where in the last month, in addition to the usual discussions, I would say we shifted our attention to reflection on the development process of Tau, the intended evolution process. We try to mutually challenge our current proposals to find out whether there are any limits or aspects that we haven’t sufficiently addressed yet. This process is ongoing. My particular interest is in reflecting on how expressive an initial Tau language should be in the proposal we make, and whether we have a clear methodology for the transitions in the evolution process. This is not part of, let’s say, the first implementations, but it should eventually be discussed, because the evolution methodology might impose constraints on the first shot at a Tau language. This is an interesting and very stimulating, academically inspired discussion, and we continue to learn a lot there. To foster the discussion next week, I have been working on a simple example that could guide this methodological discussion, maybe starting from a very small, idealized little Tau language and moving to a more elaborate one, which we could then discuss next week with regard to how something like that is intended to work out in the long run. In addition, I’m working on the external reasoner that I shared with a colleague last month on GitHub and gave demos about. There are little examples we work with. Last month in particular, I focused on fixing some issues around dealing with free variables on tableau branches in the model-finding process, where I still had an issue to solve. The next step will be to check the model-finding and theorem-proving performance of that implementation on other benchmark examples as well, and to find out whether there are further problems to be addressed.
Dr. Cramer:
We have this idea, originating from Ohad, that given any particular language that gives us the means to express the things we need for the everyday usage of Tau, we want to have on top of it a richer language that additionally allows us to speak, for example, about the consistency of expressions in the base language. This is required for discussions about changes to the system: when you change the system, you want to make sure that, first of all, it stays consistent and internally coherent. You may also want changes that are more significant, in the sense of being somehow inconsistent with some core of the previous version that you want to keep rather stable, to be more difficult to make. For example, maybe you really need a much higher consensus for these kinds of changes than for other changes that just alter some details of the system without really touching the core. But for that, you also need this notion of consistency with the core functionality of the system whenever you consider a change.
Ohad has proposed a certain methodology based on Boolean algebras to basically add this kind of consistency notion to any given language that we might have at any specific time for expressing what we really want to do in everyday business; on top of it, we have a more expressive language that expresses this consistency. We’ve discussed the pros and cons of that particular proposal by Ohad, and also of an alternative proposal put forward by Christoph and myself that involves the methodology of higher-order logic. Our alternative makes things, let’s say, computationally more challenging, but it also has fewer limitations on the side of expressivity.
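To make the consistency notion a bit more concrete, here is a minimal Python sketch (my own illustration, not the Boolean-algebra or higher-order-logic machinery under discussion): for a propositional expression, "consistent" can be read as "satisfiable", which for small examples can be checked by brute force over all truth assignments.

```python
# Illustrative only: names and representation are made up, not a Tau/TML API.
from itertools import product

def is_consistent(formula, variables):
    """Return True if some truth assignment satisfies `formula`.

    `formula` is a Python callable over boolean keyword arguments.
    """
    return any(formula(**dict(zip(variables, bits)))
               for bits in product([False, True], repeat=len(variables)))

# A change that contradicts the core is inconsistent with it:
core = lambda p, q: p and (p or q)          # satisfiable (p = True)
change = lambda p, q: (not p) and q         # satisfiable on its own
combined = lambda p, q: core(p, q) and change(p, q)  # forces p and not p

print(is_consistent(core, ["p", "q"]))      # True
print(is_consistent(combined, ["p", "q"]))  # False
```

Brute force is of course exponential in the number of variables, which is exactly the kind of computational cost the trade-off discussion above is about.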
We really need to find a good balance between expressivity and computational cost. Such a balance is not always easy to find, which is why we have to think hard about precisely which use cases we need, in which contexts we really need to compute which things, and so on, to get the right balance here. This will allow us to choose which of these different proposals we will implement in the Tau system.
Juan:
I shifted my focus over the last several weeks from second-order logic towards our arithmetic capabilities. In particular, I solved all outstanding issues with our count built-in, which effectively should allow us to implement cardinality constraints in our formulas and in our rules. I’ve been digging deep there, improving and fixing issues in the count built-in and extending the possibilities for the actual TML programmer. This work is on top of what Tomas did many months ago around the refactoring of our built-in features. On that aspect of our system, I’m also integrating what Omar has been working on with the Bit Universe transformation, in order to complete it with the arithmetic primitives/operators for addition and multiplication, and also for inequalities in the sense of less-than-or-equal.
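As a rough illustration of what a cardinality constraint buys the programmer, here is a Python sketch (not the TML count syntax, which is not shown here): a rule can be made to fire depending on how many tuples match a pattern in a relation.

```python
# Hypothetical example: relation and predicate names are invented.
edges = {("a", "b"), ("a", "c"), ("b", "c")}

def count(relation, predicate):
    """Count the tuples in `relation` that satisfy `predicate`."""
    return sum(1 for t in relation if predicate(t))

# "A node is a hub if it has at least 2 outgoing edges" as a counted rule:
nodes = {x for e in edges for x in e}
hubs = {n for n in nodes if count(edges, lambda t: t[0] == n) >= 2}
print(hubs)  # {'a'}
```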
These are relevant and promising developments in the TML core that really improve its Boolean algebraic capabilities and will very likely improve performance on many different use cases. As a last comment, regarding some questions both on our Telegram channel and in the monthly questionnaire: second-order logic is still for internal usage. We are still solving some relevant points there, so it is not ready for general user testing, and we ask for a little more patience. As I mentioned, we reached a decent milestone, but we still need to figure out how to deal with the correlations between, let’s say, the first-order structures implied by the second-order variables.
So, there are points to consider deeply, but it’s still on the design table and we will be iterating over them in the short term for sure.
Lucca:
This month I managed to finish the first phase of the 2-CNF extraction feature. By this we mean that we are now in a position to extract the 2-CNF as the BDD is built, and, as a second aspect, to have the correct interaction between the BDD storage and the 2-CNF storage. This was quite a challenging task to achieve, but we have now done it. As further progress, it is planned to review the storage of the 2-CNFs so that we get the best possible performance from this new feature. Once this is in place, we’ll get to the performance improvements of the TML language itself.
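For background on why 2-CNF is such an attractive target, here is the standard textbook construction in Python (my own illustration, unrelated to the actual BDD integration): 2-SAT is decidable in linear time via the implication graph, where a clause (a ∨ b) contributes the edges ¬a → b and ¬b → a, and the formula is unsatisfiable exactly when some variable and its negation land in the same strongly connected component.

```python
# Generic 2-SAT via implication graph + Kosaraju SCC; illustrative only.
from collections import defaultdict

def two_sat(n_vars, clauses):
    """clauses: list of (i, j) literal pairs; k means var k, -k means its negation (vars 1..n)."""
    graph, rgraph = defaultdict(list), defaultdict(list)
    def add_edge(u, v):
        graph[u].append(v)
        rgraph[v].append(u)
    for a, b in clauses:
        add_edge(-a, b)  # clause (a or b) means: not a implies b
        add_edge(-b, a)  # ...and: not b implies a
    lits = [v for k in range(1, n_vars + 1) for v in (k, -k)]
    # Pass 1: DFS on the graph, recording finish order (iteratively).
    order, seen = [], set()
    for start in lits:
        if start in seen:
            continue
        seen.add(start)
        stack = [(start, iter(graph[start]))]
        while stack:
            node, it = stack[-1]
            advanced = False
            for nxt in it:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append((nxt, iter(graph[nxt])))
                    advanced = True
                    break
            if not advanced:
                order.append(node)
                stack.pop()
    # Pass 2: DFS on the reverse graph in reverse finish order labels SCCs.
    comp, label = {}, 0
    for start in reversed(order):
        if start in comp:
            continue
        comp[start] = label
        stack = [start]
        while stack:
            node = stack.pop()
            for nxt in rgraph[node]:
                if nxt not in comp:
                    comp[nxt] = label
                    stack.append(nxt)
        label += 1
    # Satisfiable iff no variable shares a component with its negation.
    return all(comp[k] != comp[-k] for k in range(1, n_vars + 1))

print(two_sat(2, [(1, 2), (-1, 2)]))    # True  (x2 = True satisfies both)
print(two_sat(1, [(1, 1), (-1, -1)]))   # False (forces x1 and not x1)
```

This polynomial behaviour is the contrast with general BDD operations, whose worst case is exponential.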
The second project I started this month is looking into dependently typed programming languages, because the new language for Tau may eventually go in a functional direction. For this reason, we want to understand early on how side effects are treated by a dependently typed programming language, and I have started my journey with Agda, which is a dependently typed programming language.
Andrei:
Last month, I finished some outstanding basic functions for our Agoras Live platform. I was also searching for some professional help with the testing of the platform. While searching for such people, they requested documentation for the platform, so I had to put everything aside and start writing the documentation for the backend. Along the way, I did a massive code cleanup and code refactoring, especially on the backend side of the platform.
I moved our backend to a faster backend framework; it’s more than 20% faster. Technically speaking, I moved from Node.js Express to Fastify. I also removed all obstacles to running our backend as a stateless service. This will allow us to quickly deploy our backend to services like Sirocco or others in the coming months, and to quickly scale horizontally to any demand, any load. Next month, I plan to finally finish the Swagger documentation so we can get ourselves a proper tester.
Tomas:
I’ve been working on state blocks (https://youtu.be/gqwUvSKGPy0?t=1161). State blocks are helpful for better structuring a program which goes through many states, and they are especially useful when you have a state you need to visit repeatedly. State blocks are implemented as a transformation which adds the state block’s term into the bodies of everything inside the block. So, for example, these facts are added only when there is an ‘init’ fact in the database, and these rules are executed only when the term for this state exists in the database. The transformation simply adds the state term into the body of each rule.
This way, the rule is executed only when the state is active. If there is a deletion sign after the label of the state, it is called a “flipping state”. This sign adds a new rule into the program saying that when the program enters this state, the state term is removed from the database. So, it gets transformed into something exactly like this, which means: if there is a ‘finish’ fact in the database, remove the ‘finish’ fact from the database. This makes the state last only one step. This particular program contains three state blocks: ‘init’, ‘run’ and ‘finish’. Here the program says to enter the ‘init’ state. In the ‘init’ state we add the input data, which is a graph represented by relation E, and we move on to a state called ‘run’.
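The transformation described above can be modelled in a few lines. This is a Python sketch with an invented rule representation, not TML’s actual implementation: each rule inside the block gets the state term appended to its body, and a flipping state additionally gets a rule that deletes the state term once it is present.

```python
# Toy model: a rule is a (head, body) pair, body is a list of term strings.
def state_block(state, rules, flipping=False):
    """Guard every rule in the block with `state`; hypothetical representation."""
    guarded = [(head, body + [state]) for head, body in rules]
    if flipping:
        # A "flipping" state removes its own term one step after it appears,
        # so it lasts exactly one step ("~" marks deletion here).
        guarded.append((f"~{state}", [state]))
    return guarded

print(state_block("run", [("path(x,y)", ["e(x,y)"])]))
# [('path(x,y)', ['e(x,y)', 'run'])]
print(state_block("finish", [], flipping=True))
# [('~finish', ['finish'])]
```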
Then the program executes these two transitive closure rules, and it runs until it reaches a fixed point. That would normally mean the end of the whole program, but we may want to do more once the rules in some state reach a fixed point. There is a command-line option for a fixed-point step, -FP-step or just -FP. This gives the programmer a chance to detect a fixed point. With this option enabled, if the program reaches a fixed point, it adds a term ‘-FP-’ into the database and runs one more step of the program. This way, programmers can create a rule which is executed when the program reaches a fixed point.
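The fixed-point mechanics can be sketched like this (Python, with an invented tuple-based database; the marker term name here is illustrative): run the step function until the database stops changing, then insert a marker term and run one more step so rules can react to the fixed point.

```python
def run_to_fixed_point(step, db):
    """Apply `step` until the database stops growing, then add a marker term
    (mirroring the fixed-point option described above) and run one last step."""
    while True:
        new = step(db)
        if new == db:
            break
        db = new
    return step(db | {("-FP-",)})

def tc_step(db):
    # One step of transitive closure over edge facts ("e", a, b).
    edges = [t for t in db if t[0] == "e"]
    derived = {("tc", a, b) for (_, a, b) in edges}
    derived |= {("tc", a, c)
                for (_, a, b) in [t for t in db if t[0] == "tc"]
                for (_, b2, c) in edges if b == b2}
    return db | derived

final = run_to_fixed_point(tc_step, {("e", 1, 2), ("e", 2, 3)})
print(("tc", 1, 3) in final)  # True
```

A rule that matches the marker term is exactly the hook the transcript describes for moving from ‘run’ to ‘finish’.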
You can see in this rule that, when the fixed point happens, the program removes the ‘run’ term and adds the ‘finish’ term into the database. It also removes the fixed-point constant. Then this cleaning rule is executed, so only the desired data remains in the database at the end. The program gets transformed into something like this. Of course, you can write states in plain TML this way, but when you need to write more complex state machines, it starts to be very difficult to read.
Let’s see. The program executed; you can see the programs are identical. Here is another example of a program using state blocks. This program sorts data in relation A. You can see that you can nest states, but it comes with more complex state transitioning. You can see that the original is ordered here. The first argument here is the order of the data, so you can see this is the first, second, third, fourth, and so on.
Here is the last example: an algorithm for solving Sudoku games (https://youtu.be/gqwUvSKGPy0?t=1508). This way we enter all known positions. Here’s some initialization. Here’s the backtracking algorithm. These are the states we go through many times. This way we go through all the game positions and try all possible values, and any time we find an invalid path, we go back and try the next number in the row. So, this is the last example. Let’s see the execution. Here you can see the final solution.
Murisi:
This month I was mainly working on improving performance and memory usage. One of the things I did was extend the variable shifter caching mechanism, so that we can essentially avoid doing certain kinds of computations when we can identify that they’ve already been done in the past. That’s now done. This may improve certain kinds of workloads; there are certainly theoretical gains, but we still need to identify the workloads where this optimization performs best.
Another thing I worked on was identifying performance bugs in the binary decision diagram implementation. Essentially, by just fixing the core binary decision diagram implementation, I was able to attain a 43x improvement for some TML programs, and still some good performance improvements in other TML programs as well, though maybe not as much as 43 times. So, a mixture of performance improvements. This was because these bugs mainly had to do with memoization of the results of previous computations: the core library was forgetting the results of certain computations, causing the number of computations to explode exponentially, which caused a huge performance drag. By fixing that, I was able to indirectly improve the performance of TML programs.
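The memoization effect described here can be illustrated with a toy recursion over a shared structure (this mirrors the idea, not the actual BDD code): without a memo table, the same subproblems are recomputed exponentially often, while caching makes each subproblem cost one call.

```python
# Counting root-to-leaf paths in a DAG where node (i, d) has children
# (i, d-1) and (i+1, d-1); subproblems are heavily shared.
from functools import lru_cache

def count_paths_slow(i, depth):
    # No memo table: about 2**depth recursive calls.
    if depth == 0:
        return 1
    return count_paths_slow(i, depth - 1) + count_paths_slow(i + 1, depth - 1)

@lru_cache(maxsize=None)
def count_paths_fast(i, depth):
    # Memoized: each (i, depth) pair is computed exactly once.
    if depth == 0:
        return 1
    return count_paths_fast(i, depth - 1) + count_paths_fast(i + 1, depth - 1)

print(count_paths_fast(0, 200) == 2 ** 200)  # True
# The slow version would need roughly 2**200 calls for the same answer.
```

A BDD library’s operation cache plays the same role: forgetting cached results turns polynomial traversals of shared nodes into exponential ones.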
I also managed to remove some other bottlenecks which were taking time, and to improve the way the engine factorizes certain kinds of computations in order to improve performance. Now that I’ve fixed some of these issues in the core binary decision diagram library, some second-order issues have started to crop up, in particular issues to do with memory allocation. Memory allocation began to take the largest share of time in our TML program execution. I saw that if you replace the sophisticated memory allocation mechanisms with just a simple bump allocator, which just dumbly takes a little bit of memory from the operating system, the performance of TML programs also greatly improves; some programs improve by 37%. I’m hoping that next month I will flesh out this optimization more and make it integrate better with the TML code base, so that we can all enjoy a faster-performing TML interpreter.
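The bump-allocator idea can be sketched in a few lines. This is a toy Python model of the concept, not the actual allocator used in the TML code base: take a block of memory up front and serve allocations by advancing an offset, with no individual frees, only a whole-arena reset.

```python
class BumpAllocator:
    """Toy bump allocator: allocation is just pointer (offset) arithmetic."""

    def __init__(self, capacity):
        self.buffer = bytearray(capacity)  # the arena, grabbed up front
        self.offset = 0                    # the "bump pointer"

    def alloc(self, size):
        """Return the start offset of `size` fresh bytes, or raise MemoryError."""
        if self.offset + size > len(self.buffer):
            raise MemoryError("arena exhausted")
        start = self.offset
        self.offset += size
        return start

    def reset(self):
        """Free everything at once; individual frees are intentionally absent."""
        self.offset = 0

arena = BumpAllocator(1024)
print(arena.alloc(16), arena.alloc(32))  # 0 16
```

The speed comes from allocation being a single bounds check plus an addition, at the cost of no fine-grained reuse, which is a reasonable trade-off for phase-structured workloads.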
Ohad:
I’ve been involved in everything, or almost everything, you heard here. One thing that was not mentioned is the parse reconstruction from the Earley parser which I wrote. I put some thought into the various options for how to do the parse forest. I also shared this research with the team, specifically with Omar and Murisi, to see how they can take it further.