**THE COMPLETE ARITHMETIZATION PROJECT -
PART III: UNIVERSAL MULTIPLEXERS**

*T. E. Raptis*^{©} 2020

** **

A PDF with all three parts is available
*here*

**Introduction**

In the previous parts we described how to make a hierarchical evaluation of certain symbolwise functions on binary integers that can be seen as abstractions of a variety of low level automata. We also showed some simple examples from combinatorics where arithmetic fractals and other interesting sequences were derived of who's the regularities are considered to be inherited from the multi-periodic nature of an underlying counter system endowed with a primitive form of self-similarity with a wreathing product structure.

What we shall deal with in this last part is more theoretical concerning the possibility of transferring the previous into a more abstract algebraic content where the set of all programs can be viewed from a different angle related to dynamical systems theory. We ask if it is possible to analyze the paths (history) of every computation as a whole in a way that would reveal the important role of the underlying structure inherited by the wreath product. The reader may want also to consult again the math appendix at the end of Part I where the evidence is presented on the origin of this effect.

It deserves to mention here previous efforts which were based on
experiments with some modern versions of combinator calculus that recently
appeared as a visualization of what became known as a
*Turing Tarpit*
and which also provided some evidence on a kind of
fractal structure constraining the set of all outputs obtained this way. There may
be also some association between the approach presented here and previous
efforts in so called,
*Supercompilation*
as first introduced by Valentin Turchin to be incorporated in his version of the
*REFAL* functional
language. In order to make an abstract model out of the previously
introduced Hamming Hierarchy we shall need to move to operator language.

**Non-commuting multinomial expansions**

** **

Recall the previous construct of a sequence of Mersenne intervals *S _{L}*
and an associated sequence of

|0|à [0 0 ... 0] à *m*_{0}( *m*_{0}( ... *m*_{0}(
*x* )...))

|1|à [1 0 ... 0] à *m*_{1}( *m*_{0}( ... *m*_{0}(
*x* )...))

|2|à [2 0 ... 0] à *m*_{2}( *m*_{0}( ... *m*_{0}(
*x* )...))

|3|à [0 1 ... 0] à *m*_{0}( *m*_{1}( ... *m*_{0}(
*x* )...))

...................

There is now an additional difficulty due to the arbitrariness of domains and co-domains of these compositions which can only be treated with branching and this is treated with an additional construct in the next section where the full UMUX method will be developed. At the moment we assume the maps to be somehow valid on all integers by some internal consistency mechanism to simplify the discussion.

Such a scheme establishes a 1-1 correspondence of all possible paths
of a computation that could be performed with a list of some elementary maps
over all possible inputs as in product of the form *H* x *H*. Such
entails a particular kind of product between the
matrices
for the symbolic path expressions and their associated intervals. Having
examined the arithmetization of certain simple maps in Part II of this study we
may also assume all such maps being strictly quantitative as m: *N* à *N*.
With this simplification we may posit which
in practice means that we only have to deal with a set product of all paths as
symbolic words in a base *b* alphabet with all possible arithmetic inputs.
If a list of maps of length *b* is sufficient for universal computation
then the same correspondence will always be true independently of the
existence of any fixed points or otherwise said, of some *halting *state*
*assuming an unbounded hierarchy of intervals*.* The global output for
such a procedure can be given as a hierarchy of 2D arrays or *Universal
Global Maps* where each level of the hierarchy now represents a new step. Moreover,
if an additional tag or special bit is preserved for any single entry to have
been visited once per step, then the equivalent Boolean array indicating such a
visit must by construction be a total permutation matrix. An interesting open
question is then whether there do exist classes of universal FLUTs giving rise
to a recursive, or even arithmetic fractal graphs out of the underlying
transverse counter set as explained in the next paragraphs.

Let us now make this more systematic via the following algebraic
trick. We can make use of the fact that functional composition is in general a
non-commuting operation meaning that we are not in general allowed to write. Then we can directly assert
that the LUT based construct is identical or isomorphic for any level *L*
with the expansion of a multinomial of the form

where the symbols *m _{i} *are to be interpreted as
functional operators in general and {ij...k} is a word obtained from a column of
the associated LUT. The only case where this could be reducible to a standard
multinomial with many paths collapsing to the same multi-index space can be seen
with an abstract analogy from linear algebra. Assuming then arbitrary linear
maps over some vector space

What is more interesting here is the fact that the primitive
self-similarity which is inherited from the underlying multi-periodicity of the
counters for each collection of symbols may influence the thus obtained 2D
global maps. While it is difficult to give a precise quantitative measure of
such an inheritance, it is possible to see where such an influence originates
with the aid of an additional device. We can in fact study the formation of all
words of indices in the summand above via an additional discrete "time" *\tau**
*governing the underlying counter system as say

Such is a type of '*transverse*' inner flow which permeates all
possible computational paths even though one may have to reach only a single
one during a specific computation involving also some unavoidable branching
(more on this in the next section.) Additionally, there is a possibility of
generalizing this when dealing with such maps that may belong to the elements
of some discrete group of symmetries. There is at least one other property that
may be crucial whenever there is a representation allowing to make use of it. One
such example comes straight from number theory where action over elements can
be transferred to the action over their indices and it is known from the area
of so called, *Fibonacci Primes* where we have the following unusual
property for the action of the *gcd *function as. It would an obvious
advantage if this could be generalized for a specific list of fundamental basis
functions that can collapse computational paths as say. This is still an open
research question at the moment.

**The Universal Multiplexer (UMUX) method**

Suppose one would like to use the previous method to analyze certain
low level computations of which the global maps could reveal also possible side
effects and in general allow for an alternative type of what is known as '*formal
verification*' at least at a primitive level. In order to deal with the
general case one needs to find also ways to invoke branching in an effective
way without breaking the algebraic structure in the original scheme of the
previous section. A slight modification of the original then can be given with
the aid of what is known as the fundamental theorem for
*structured programming*.
The essence of it is that we can cast any algorithmic expression of a well
structured program into the standard format of a single loop with a large *switch...case...*
conditional inside. Loops can also be turned into singleton *if...then*
conditionals using a counter and recursion. We may consider a particular normal
or canonical form of the above whenever all operations inside any structured
program can be adjoined with an equal set of appropriate disjoint Boolean characteristics
or indicator functions
over any
interval of possible inputs. We already wrote about the need to refer to pairs
of integers when working on the Hamming due to the imposition of ad hoc
expansion lengths greater than their natural bit length. In analogy, we may
talk here of a triplet representation requiring both { |x|, L_{|x|}, \chi(|x|) }
independently of its expansion to any arbitrary alphabet
where, (|x|) is used to denote a property or
even a set of properties as say, in a generic Boolean condition over various
indicators as . Application of the
above must be done with caution in special cases that appear incomplete as in a
nested statement of the form

*If Chi1
do f_{1} *

*if * *Chi*2 *do f*_{2}
*endif*

*endif*

In such a case a transformation is required as and.

All the above can be nicely summarized as a total algebraic operator of the form

By a slight abuse of notation we interpret all compositions
as a kind of "convolution" of a total (vectorized) set
of inputs by a Boolean vector which "windows" out all invalid inputs from a
span of *m* sub-domains according to its own indicator.
This is basically the same as tagging all input data according to
a quantifier which is also a function of itself this being a strongly
non-linear operation. In practice this leads to a kind of multiplexing
technique where each input is redirected to a different map. One may call this
an algebraic pipelining method.

In most cases, where all maps are restricted inside certain
sub-domains of validity, their co-domains may become disjoint at some point. In
general not all paths available from the original matrices will be populated all the time. This can better be understood with
a network diagram where all map outputs may have arrows to all others for as
long as the relevant associated indicators allow that in sufficiently large
input intervals. Collecting each path for all sub-domains at every step will
leave a trace of words in alphabet *b* or their corresponding arithmetic
indices. Depending on the nature of the maps the set of these indices may or
may not cover the whole interval *S _{L}* in successive steps and
successive input intervals and whenever it does we may call the path space "

Some generic properties follow immediately. Assume any list of
indicators to form a perfect span of any interval *S _{L}* by
forcing at least one of the maps say

where. It is also possible to apply a similar logic in order to derive a type of probability associated to each path as

where the notation again implies summing over all outputs filtered from the original input set reached through the same path.

**Uniqueness, irreducibility and factorizations**

Taking a second look at the finally obtained expression of *Z*
allows realizing that it works much as a prime factor say, like *2 ^{L}*
or

.

Trying to match them would then lead to a product much like in a factorization as

On the other hand this implies that all causal paths created this way should belong to a subset of a more complete operator over a quaternary alphabet like

By what was said on the fundamental theorem, the two last
expressions should be somehow equivalent that is there should always be a way
to write things in a "prime" format. The only way that these two could be
compatible is by additional compositions of indicators which lead us to the
territory of the general
*Satisfiablity*(SAT)
problems such
that the set of all possible paths is appropriately "clipped". When and how this
is possible depends strongly on the inner structure of all SAT problems the
solutions of which might also be influenced by the same wreath product
structure underlying any Hamming hierarchy. That is, there will be cases where
one can always factorize a total *Z* operator whenever the associated SAT
problems "resonate" with specific subsets of paths. Equivalent, predicting what
will be analyzed in detail in next sections there will be a stochastic transitions
matrix of whom several entries will be zero corresponding to symbols of operators
that should not "meet" thus effecting the same clipping.

Unexpectedly, this might help to shed some light to an apparently
remote problem in physics, that of the so called "*entangled*" states
where factorizability is a crucial issue at the cost of interpreting all
particles states as being associated to the causal paths of a massively
parallel execution of an underlying machine. Moreover, such a machine would be
required to curry over a '*self-modifying*' type of code at least with
respect to the indicators which should also involve what is referred to as an
act of '*observation*' which of course is kind of a cyclic argument since
any observer would be part of the machine! The mentioned act of clipping the
set of causal paths also has the ability to present higher correlations when
looking at the set of all possible outputs than if all were available to
execution as if there were inner constraints. The same could look like a kind
of "trickstery" where an observer without access to the true indicators may be
led to believe to the existence of "invisible" or "hidden" interactions. Of
course, many more things are to be added to such a construct regarding
additional symmetries, etc., but we have already seen in Part I that the
underlying wreath product structure already contains "for free" a sampled
version of two important ones, specifically a discrete version of *U*(1)
(to be found in all 'projectified' Hilbert spaces where a 'phase' cannot
be measured)and the "*boost*"" sector of a Lorentz group.

**Implementation issues**

To use the above in any efficient manner without any need for a
branch conditional we need an additional recipe given a selection-redirection
mechanism for all input data *x* according to an auxiliary FLUT of
indicators. To actually make use of this, we need
a method for making arbitrary permutations of such tagged inputs from any
interval across the hierarchy to be fed into an ordered list of maps that can
be efficiently coded via a cell of anonymous functions using the standard
Matlab conventions. There is one last tricky point on the inclusion of a zero
arithmetic input to any map due to the double character of this symbol per se
being simultaneously a possible input and an arithmetic indicator, the simplest
solution to this being a double shift. As an example, for an input interval
[0,...,2* ^{L}* - 1] and an indicator as (

As in the previous two installments, we are more interested in
arithmetized primitive operations on patterns but for similicity we can use
here an example of a well known case of branched function, the one giving rise
to the
*Hailstone*
sequence for which we only have two
complementary indicators making a full span of every interval (no need for an
additional *Id* map). A simple example script showing the combined operations
is shown below for the *S _{8}* interval

clc, close all, clear all

L = 8; % iteration length

F1 = { @(x)x/2 @(x)3*x+1 };

F2 = { @(x)1-mod(x, 2) @(x)mod(x, 2) };

base = numel( F1 ); S = base^L;

xin = 0:S-1; xout=zeros(1,S);

seq = ones(1, S); w = [];

for t = 1:S %reduce this to L or similar for less memory consumption if need be

for i = 1:base

idx = F2{i}( xin ); y = idx.*(xin + 1) - 1 ;

xout( find( idx ) ) = F1{i}( y( y >= 0 ) ); % automated sorting

px(i, t) = sum( idx )/S ; % normalized indicator densities

end

xin = xout; w = [w; idx]; % path traces

end

figure( 1 ), hold on, for i=1:base, plot( px( i, : ) ), end, grid, hold off

figure( 2 ), imagesc(1 - w), colormap gray, pause(0.1)

Figure 1 will show up all intermediate parts of this sequence along all initial conditions. The evolution of the indicator densities is shown in the next figure

The equilibration we see after a transient interval is due to the
well known fact that all trajectories in this sequence finally converge towards
a 2-periodic attractor, something which has been verified for extremely large
intervals but not formally proven, also known as the
*Collatz*
conjecture. The second figure shows all paths followed by the
binary indicators for all initial conditions as a 2D graph of a *2 ^{L}*
x

These can be further analyzed via a direct run length analysis or
use partial contractions along each dimension to locate possible sequences of
interest.
In the primitive example used here, there is a randomization of
blocks due to the fact that one of the branches is actually used to eliminate
all prime factors of 2. All paths could be made into simple alternating
sequences as 0101... or 1010... sequences for even and odd positions respectively
by just using a direct factorization of each new input at the first branch and
simultaneously eliminating any power *2 ^{k}* without affecting
convergence. Lastly, it should also be mentioned that the sorting method used in
the script is not yet fully arithmetized while a better or more consistent method
to the spirit of present analysis would be to utilize the 'viral' encoding presented
in Part I so that all input data are formed via a concatenation with their index or
'pointer' while only the 2nd part is actually passed through every individual map

**Branching via complexification**

The approach explained in the previous section can be applied either to an interval or to a single input. A remaining question is the possibility of further formalizing it in a manner independent of the specifics of any programming dialect. Interestingly, an algebraic alternative in the implementation of the previous section would be simply a product like or a summand of powers like where the constant is to take care of the residual summand of all invalid inputs. The problem with any such expression is that no protection from side effects is offered for possible invalid inputs which are simultaneously fed unless there is a specific axiom of precedence which evaluates exponents first and nullifies any further evaluation apart for a single member each time but this is something to be added as an extra axiom hence these expressions do not appear self-sufficient.

One other road towards a safe and compact global branchless
expression for any input appears to be an interpretation of the choice of
branch as a "rotation" which entails at least one additional dimension. One can
do so (always assuming only arithmetized integer inputs) by promoting any such
into a complex pair using the set of *b* roots of unity as

,

with the exponent always in the range [0...*b*-1] being defined
by the corresponding set of indicators as a product or a summand form

Notice that at least the first part allows connections with a
special linear vector space via the transform where
now ** k **is a vector containing the logarithms of [1,...,

While the above is not the easiest way to implement a structured
algorithm it does offers some additional theoretical opportunities. In the
simplest case of a binary choice as in the example of the previous section one
can project paths in a particular geometric structure in the complex plane
where the use of some phase like {-1, 1} or {-**i**, **i**} does not
matter at the moment unless a particular choice makes more sense. The main theoretical
question which is also one of the ultimate reasons for writing this part
follows next.

**The totality of causal paths **

As already noticed in Part II of this series, the totality of causal
paths may still hold some secrets at least for the general class of *N* à *N*
fully arithmetized maps that can give rise to some fractal or otherwise
multi-periodic sequences. This brings about a natural question on the biggest
possible subset that could behave in an apparently constrained way including
the set of all possible fixed points for those that finally halt. Another
technical question is how to make this question precise in the UMUX language of
the two previous sections. More general discussion of such an effect can be
found in the last epistemological addendum.

Using the material from the previous two sections we may directly compare an arbitrary path composed from the decoding of a large word in the picture of the underlying transverse counter flow as

What this implies is a crossing of two "transverse" discrete times
one being defined by *t* as the time of a specific computation on some
valid path and the other as a discrete "free" counter time over all possible
paths. At the moment we may also make the simplifying assumption that all
functions *f* do not suffer from side effects so that the choice of
indicators influencing *n*(*x _{t-k}*) form a free span of a
large interval. It is natural to ask to what extent and under what conditions
would some properties of the underlying word {

Since such is too vague a question, it is preferable to resort to a
more detailed analysis of the kind of dynamics represented by the algebrization
of structured algorithms. The first to ask would be the overall density of
paths as *|Z*|/*b ^{L}* for any total state space. For a simple case of a
dynamical system with only binary branching which does not behave as a random
number generator this might represent a very small fraction the larger the
expansion length

A relatively trivial example can be given again with the aid of the
Hailstone sequence where the iterative pairing of *f*_{0}(*x*)
= *x*/2 with its native indicator (*mod 2*) and its complement gives
a result we have already seen in Part II in the case of the exponent of the
first order factor in the factorization of any integer in any *S _{L}*.
In any such case we take zero as a fixed point. The second map

clc, close all, clear all

L = 10; N = 5 % Interval exponent and iteration length

F1 = { @(x)x/2 @(x)3*x+1 };

F2 = { @(x)1-mod(x, 2) @(x)mod(x, 2) };

base = numel( F1 );

S = base^L;

x = 0:S-1;

y = repmat( x, base, 1 );

for i=1:basefor image=1:N*L

y(i, :) = F1{i}( y(i, :) );

mat( i, 1, image ) = sum( F2{i}( y(i, :) ) )/S;

mat(i, 2, image) = 1 - mat( 1, i, image );

end

end

x=1:image; plot( x, squeeze( mat(1, 1, :) ),'.', x, squeeze( mat(1, 2, :) ),'r.',...

x, squeeze( mat(1, 2, :) ),'g.', x, squeeze( mat(1, 2, :) ),'m.' )

The resulting figure shows the output being indeed of an alternating nature.

What really happens here is that a transition appears after the 10^{th}
image the reason being that at this point all zero blocks in the binary pattern
of any initial condition in *S*_{10} signifying the presence of a
factor 2* ^{C}* with

**Substitution Systems**

Generic substitution systems present a different problem regarding
their hierarchical structure due to certain ambiguities and structural
differences. Some arithmetized substitution methods were presented in Part II. Leaving
aside various technicalities regarding
*confluence*
and normalization one may start from a universal extension of the original *Thue *grammar
which forms the 0 level of the Chomsky hierarchy of languages assuming no
difference between random or sequential rule application.

Let then
be a standard (*R*_{T})
LUT for replacement rules in the sense of
*w*_{L} à *w*_{R}
with standing for the equivalent
numerical value of each word and *k*_{L,R} stand for the
equivalent string lengths for each word where action starts from an initial "*axiom*"
string. Some care is required in that these may not necessarily coincide with
the natural length *l*(*w*) as defined in the previous parts due to
the possible presence of zeros at the end. An important property can be used to
classify all such systems via a parameter given by the ratios of their
effective lengths, *i* = 1,..., *N*
as "*contractive*" if all , "*conservative*"
if all *\lambda*_{i} = 1
and "*expanding*" if there is at least one *\lambda*_{i} > 1. For the first two cases, we
may still talk of endomorphisms inside each matrix
for each interval *S _{L} *of all possible initial axioms while the
last one may act as a surjection on some higher interval for some steps even it
finally converges on the original or smaller length. Especially for the
conservative case it is evident that we can write an equivalent arithmetic
formula of any replacement for the integer values of all axioms as

where *k*(*x, x*_{L}) denotes a special function
for finding the lowest symbol where replacement is to take place. We can show
that all these obtain a canonical form due to the special properties of any matrix. Interestingly this
case can be recast as a discrete derivative where
*Dx* is used to denote a vector of signed integers replacing the original *R*_{T}.
Additionally the powerset of all possible subwords coincides with the set of
all proper submatrices with *L*
the total axiom string length and by simple counting over powers of some base
we get a total of *b ^{L}* - 1 subwords that could be taken form an
additional arithmetic interval. Thus one
should in principle be able to define a total

To recast any of the above in the originally proposed multinomial
form *Z* one should first obtain by some method a set of indicators
forming a proper span of any interval *S*_{L}. While
it seems possible to associate any pattern locator with an indicator
and the replacement with the arithmetized map *m* there are
additional difficulties regarding the originally assumed convexity of the set
of all indicators in any interval.

To better understand the necessary decomposition we should further
analyze the structure of the lowest symbol pointing function that we denoted as
*k*(*x, x*_{L}) above. Suppose for instance a rule like '100'
à '01' that requires finding all such instances for all strings
across any interval *S*_{L} fixing *b* to the binary case.
But we do know that each segment of any length *k* < *L* will
inevitably repeat periodically across all columns of any
and if we set a first pointer
at the beginning of any string than that period would be simply *2 ^{3}*
for the example given. From the symmetries endowed due to the wreath product
structure the only difference for the same rule pattern in higher positions will
be a set of shifts and dilations of the original sequence at position 1 by some
period multiplier. Thus we may fully characterize any indicator beforehand thus
totally avoiding the need to separately applying an individual filter as some

w=zeros(n, 2^n);for i=0:2^n,w( strfind( fliplr( dec2bin(i, n) ), substr ), i+1 ) = 1;end

imagesc( w ), colormap gray

For instance, setting *substr* = '101' and *substr* =
'0101' for *n *= 7 gives two Boolean arrays as in the figures below where
each white dot signifies the initial position of each substring while certain
overlaps inevitably appear for strings containing periodicities in their total
patterns. Each row of the below arrays stands for an individual total indicator
over the interval.

There is now a certain ambiguity on the simultaneous handling of
such overlaps which can only be ignored under the adoption of a precedence
axiom where the first occurrence is the first to be substituted. This of course
also assumes confluence but we may ignore technicalities here. Implementing
this actually means scanning each large string for no more than
substrings with replacements
at every *nk* + 1 new valid position. Using the notation of Part II for
arithmetized string manipulations, a generic update valid for all
(non-conservative) cases in all of an *S*_{L} interval can be
given using also the "pulse" function of Part I as

where now *k*(*x, x*_{L}*, i*) is any of the
matrices obtained for all possible instance of any subword *x*_{L}.
Despite the regularities of the shifts and the periods *p _{i}* of
each row the general case presents difficulties in locating exact numeric
formulas due to the high dimensionality which is still an open question.

**Agents in the Warehouse: An epistemological addendum**

"This dewdrop world is a dewdrop world, and yet, and yet . . ."

Kobayashi Issa

Some philosopher the author does not recall once asked "How a planet
does know the law it has to follow?" Of course in the case of an algorithm or a
program we can ascertain with certainty that any such do not need to know
anything - they are purely *mechanical*. However, if a set of
computational paths appear to have certain similarities as well as the thus
obtained *chain of outputs* then it could be that people setting up an
effective computational process from different and in some metrical or other sense
"*remote*" initial conditions, strange correlations between such processes
and their obtained results could be observed. As for possible physical
implications, it may be worthy to repeat our original story in the introduction
of Part I in a somewhat different format.

Suppose you live outside some large warehouse with trucks coming and
going all the time leaving and taking packets. There is also always a large
irregular pile of packets outside by the end of each day which if it could be
precisely counted it would give you the integer 345 of whose the binary form is
[1 0 0 1 1 0 1 0 1 0] while if measured by the end of next day it would give you
1021 expanded as [1 0 1 1 1 1 1 1 1 1]. How easy would it be to know that this
pair hides a single transition of the famous universal
*rule 110* in the
Wolfram's classification of elementary cellular automata with periodic boundary
conditions not to mention keep doing so for each and every day!

Apart from this being a somewhat contrived, if old fashioned method of some
agents communicating with hidden signals, it could take a very different meaning if
what was to be measured was a gauge indicator in some lab giving a sequence of
measurements like 0.345.., 0.1021.. etc. And of course we may nowadays have
achieved immense accuracies with advanced sensors plus theoretical predictions
but trying to measure the gravitational constant *G* with arbitrary
precision would immediately reveal part of the problem! The only conclusion is
that apart from the theoretical foundations of computational structures any
other assertion remains undecidable but not in the ordinary formal way of any
axiomatic system. It just seems out of the possibilities of any current empirical
science to verify such assertions like those included in the famous
*Church-Turing Conjecture*,
unless some purely theoretical "trap" could be set up of which some prediction
would lead to a major inconsistency.