# Infinite Tao

(b) Does the principle of infinite descent work if the sequence is allowed to take integer values instead of natural number values? What about if it is allowed to take positive rational values instead of natural numbers? Explain.

## Infinite Tao

Also minor pedantic note: the principle of infinite descent says that it is not possible to find a certain sequence. So if the principle of infinite descent does not work for a certain set of numbers, that means that it is possible to find such a sequence.

(b) The principle of infinite descent does not work if the sequence is allowed to take integer values. Indeed, define $a_n := -n$ for each natural number $n$. Then we have $a_n > a_{n+1}$ since $-n > -(n+1)$ (to show this more rigorously, we can start from $1 > 0$, subtract one from both sides to obtain $0 > -1$, then add $-n$ to both sides to obtain $-n > -(n+1)$). Since this is an explicit example of an infinite descent, infinite descent is possible over the integers.
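As a quick sanity check, the descent can be tested numerically on a finite prefix. The sketch below is only an illustration, not a proof: the helper name `is_strictly_decreasing`, the prefix length `N`, and the positive-rational sequence $b_n = 1/(n+1)$ are assumptions introduced here (the answer above only treats the integer case). It also shows why the natural numbers behave differently: a strict descent starting at $s$ has at most $s+1$ terms.

```python
from fractions import Fraction


def is_strictly_decreasing(seq):
    """Return True if every term is strictly greater than its successor."""
    return all(a > b for a, b in zip(seq, seq[1:]))


N = 50  # illustrative prefix length; a finite check, not a proof

# Integer-valued descent: a_n = -n for n = 0, 1, 2, ...
integer_descent = [-n for n in range(N)]

# Positive-rational descent (assumed example, not from the answer): b_n = 1/(n+1)
rational_descent = [Fraction(1, n + 1) for n in range(N)]


def longest_natural_descent(start):
    """Slowest possible strict descent in the naturals: start, start-1, ..., 0."""
    return list(range(start, -1, -1))


print(is_strictly_decreasing(integer_descent))
print(is_strictly_decreasing(rational_descent))
print(len(longest_natural_descent(5)))
```

Both prefixes can be extended to any length, whereas the descent in the natural numbers is forced to stop once it reaches $0$.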

$$\bigcup_{x \in X} N_x = \Bbb N$$ as every $n$ has a value in $X$ and so is in a unique $N_x$. And if all $N_x$ were finite, so would the union be as $X$ is finite, contradiction; so some $N_x$ is infinite, and writing $N_x=\{n_1, n_2, \ldots\}$ in increasing order we have a constant subsequence with value $x$.
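A finite analogue of this argument can be sketched in code. The example is an assumption-laden illustration: the function names and the sample sequence are hypothetical, and a finite prefix can only show the pigeonhole bound (some index class has at least $N/|X|$ elements), not that a class is infinite.

```python
from collections import defaultdict


def index_classes(seq):
    """Group indices by value; classes[x] plays the role of N_x."""
    classes = defaultdict(list)
    for n, x in enumerate(seq):
        classes[x].append(n)  # indices are appended in increasing order
    return dict(classes)


def largest_class(seq):
    """Return (x, N_x) for a value x whose index set is largest."""
    classes = index_classes(seq)
    x = max(classes, key=lambda v: len(classes[v]))
    return x, classes[x]


# Hypothetical sequence; its values lie in the finite set X = {0, 1}
seq = [(n * n) % 3 for n in range(30)]
x, indices = largest_class(seq)

# The terms at these indices form a constant subsequence with value x
subsequence = [seq[i] for i in indices]
```

The index classes partition the index set, which mirrors the displayed union, and the largest class supplies the constant subsequence.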

In the corresponding theorem for a single sum, you are somewhat limited in the "types" of rearrangements you can do. Specifically, a rearrangement of a single sum is required to reach each term in the original sum within a finite number of steps. For instance, it would be an invalid application of the corresponding theorem for a single sum $\sum_{n=1}^\infty a_n$ to write it as $$\sum_{n=0}^\infty a_{2n+1} + \sum_{n=0}^\infty a_{4n+2} + \sum_{n=0}^\infty a_{8n+4} + \cdots$$ since such a rearrangement has infinitely many terms before it even reaches, e.g., $a_2$, and then doubly-infinitely many terms before it reaches $a_{12}$. This, however, is exactly the type of rearrangement that Fubini's theorem allows us to use.
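The indexing behind this rearrangement is that every positive integer $m$ can be written uniquely as $m = 2^k(2n+1)$, so $a_m$ is the $n$-th term of the $k$-th inner sum. The sketch below (the function name `block_and_position` is hypothetical) computes $(k, n)$; the value of $k$ is the number of complete infinite sums that precede $a_m$ in the rearranged expression.

```python
def block_and_position(m):
    """Write m >= 1 as 2**k * (2*n + 1) and return (k, n)."""
    if m < 1:
        raise ValueError("m must be a positive integer")
    k = 0
    while m % 2 == 0:  # strip factors of 2 to find the inner sum's index k
        m //= 2
        k += 1
    return k, (m - 1) // 2  # m is now odd, so m = 2*n + 1


def index_of(k, n):
    """Inverse map: the original index of the n-th term of the k-th sum."""
    return 2 ** k * (2 * n + 1)


print(block_and_position(2))
print(block_and_position(12))
```

For $a_2$ this gives $k = 1$ (one full infinite sum comes first), and for $a_{12}$ it gives $k = 2$, matching the "doubly-infinitely many terms" remark.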

How a neural network behaves during training under different choices of hyperparameters is an important question in the study of neural networks. In this work, inspired by phase diagrams in statistical mechanics, we draw the phase diagram for the two-layer ReLU neural network in the infinite-width limit, giving a complete characterization of its dynamical regimes and their dependence on hyperparameters related to initialization. Through both experimental and theoretical approaches, we identify three regimes in the phase diagram, i.e., the linear regime, the critical regime and the condensed regime, based on the relative change of the input weights as the width approaches infinity, which tends to $0$, $O(1)$ and $+\infty$, respectively. In the linear regime, the NN training dynamics is approximately linear, similar to a random feature model, with an exponential loss decay. In the condensed regime, we demonstrate through experiments that active neurons are condensed at several discrete orientations. The critical regime serves as the boundary between the above two regimes and exhibits an intermediate nonlinear behavior, with the mean-field model as a typical example. Overall, our phase diagram for the two-layer ReLU NN serves as a map for future studies and is a first step towards a more systematic investigation of the training behavior and the implicit regularization of NNs of different structures.

Error propagation is a significant problem with the decision feedback equalizer (DFE) at low to moderate SNR where channel coding is employed. The main objective of this work is to optimize the DFE for a coded system by introducing stationary error models to compensate for the error propagation. The modified DFE (MDFE) differs from the conventional DFE only in its tap values, but these must be obtained as the iterative solution of nonlinear equations. Bit error rate bounds taking into account error propagation in the convolutionally coded system are derived for both the DFE and the MDFE. Simulation studies confirm that the MDFE can yield a 4 dB gain over the DFE in a typical convolutionally coded system without interleaving, and a large reduction in decoding delay over the conventional DFE with interleaving at the same performance. An adaptive MDFE solution is derived which incorporates the error propagation model into the training. The adaptive MDFE is compared with offline and adaptive DFEs. The adaptive MDFE has the best overall performance in a convolutionally coded system. First- and second-order stability and performance analyses show a close match between theory and experiments. The general design and performance of the conventional DFE with constrained decision delay is also considered (without error propagation). This is an unsolved problem in the literature. We derive the optimal infinite impulse response (IIR) fixed-delay DFE, which is shown to require a two-sided linear filter in addition to the feedback filter. A study of the corresponding finite impulse response (FIR) DFE shows that whenever the number of feedback taps is less than the channel memory and/or the noise is colored (as is the case with matched filtering or oversampling), a two-sided linear filter should also be adopted. The two-sided linear filter improves the noise whitening and the phase of the equivalent channel, resulting in improved performance over the FIR DFE with a feedforward linear filter and the same delay.

We consider the links between Ramsey theory in the integers, based on van der Waerden's theorem, and (boolean, CNF) SAT solving. We aim at using the problems from exact Ramsey theory, concerned with computing Ramsey-type numbers, as a rich source of test problems, where especially methods for solving hard problems can be developed. In order to control the growth of the problem instances, we introduce "transversal extensions" as a natural way of constructing mixed parameter tuples (k_1, ..., k_m) for van-der-Waerden-like numbers N(k_1, ..., k_m), such that the growth of these numbers is guaranteed to be linear. Based on the Green-Tao theorem we introduce the "Green-Tao numbers" grt(k_1, ..., k_m), which in a sense combine the strict structure of van der Waerden problems with the (pseudo-)randomness of the distribution of prime numbers. Using standard SAT solvers (look-ahead, conflict-driven, and local search) we determine the basic values. It turns out that already for this single form of Ramsey-type problems, the best-performing solvers cover a wide variety of solver types. For m > 2 the problems are non-boolean, and we introduce the "generic translation scheme", which offers an infinite variety of translations ("encodings") and covers the known methods. In most cases the special instance called "nested translation" proved to be far superior.

Based on the first linearized Boussinesq equation, the analytical solution of a transient groundwater model, which describes phreatic flow in a semi-infinite aquifer bounded by a linear stream and subjected to time-dependent vertical seepage, is derived using the Laplace transform and the convolution integral. According to the mathematical characteristics of the solution, different methods for estimating aquifer parameters are constructed to suit different hydrological conditions. Then, an equation for estimating water exchange between stream and aquifer is proposed, along with a recursion equation for estimating the intensity of phreatic evaporation. A phreatic aquifer-stream system located in the Huaibei Plain, Anhui Province, China, is taken as an example to demonstrate the estimation process of the methods stated herein.

TY - JOUR
AU - Cui, Yunan
AU - Hudzik, Henryk
AU - Zhang, Tao
TI - On some geometric properties of certain Köthe sequence spaces
JO - Mathematica Bohemica
PY - 1999
PB - Institute of Mathematics, Academy of Sciences of the Czech Republic
VL - 124
IS - 2-3
SP - 303
EP - 314
AB - It is proved that if a Köthe sequence space $X$ is monotone complete and has the weakly convergent sequence coefficient $\operatorname{WCS}(X)>1$, then $X$ is order continuous. It is shown that a weakly sequentially complete Köthe sequence space $X$ is compactly locally uniformly rotund if and only if the norm in $X$ is equi-absolutely continuous. The dual of the product space $(\bigoplus\nolimits_{i=1}^\infty X_i)_\Phi$ of a sequence of Banach spaces $(X_i)_{i=1}^\infty$, which is built by using an Orlicz function $\Phi$ satisfying the $\Delta_2$-condition, is computed isometrically (i.e. the exact norm in the dual is calculated). It is also shown that for any Orlicz function $\Phi$ and any finite system $X_1,\dots,X_n$ of Banach spaces, we have $\operatorname{WCS}((\bigoplus\nolimits_{i=1}^n X_i)_\Phi)=\min\lbrace \operatorname{WCS}(X_i) : i=1,\dots,n\rbrace$ and that if $\Phi$ does not satisfy the $\Delta_2$-condition, then $\operatorname{WCS}((\bigoplus\nolimits_{i=1}^\infty X_i)_\Phi)=1$ for any infinite sequence $(X_i)$ of Banach spaces.
LA - eng
KW - Köthe sequence space; weakly convergent sequence coefficient; order continuity of the norm; absolute continuity of the norm; compact local uniform rotundity; Orlicz sequence space; Luxemburg norm; Orlicz norm; dual space; product space
UR -
ER -

The conventional paradigm of computation, embodied by the ideas of Church and Turing and embedded in the stored-program (von Neumann) architecture, formulates the dichotomy of hardware and software. Hardware is the structure of the computer; it is immutable and tangible, like the Turing tape reader. Software is the purpose of the computer; it is volatile and intangible, like the bits of data on the infinite Turing tape. Excluded from this paradigm is the possibility of modifying the structure to suit the purpose. As a result, contemporary computer architectures exhibit optimizations such as multiple bus standards for different purposes, branch prediction, and complex memory coherency protocols for multiprocessor architectures. However, if hardware could be modified (reconfigured) to suit the purpose, one can imagine a computer with a single all-in-one expansion socket, and processors would not need branch prediction or coherence protocols. Indeed, thanks to programmable logic technology, this traditional paradigm can be replaced with something new. With programmable logic, structure can be modified for a specific purpose, resulting in a dramatic increase of performance. A new genre of computer must be defined: a computer without a stored-program architecture that can be customized to any algorithm, including a Turing Machine. For this computer, the architecture becomes the program. Others who have studied such machines have referred to them as either virtual computers, custom computers, programmable active memories, functional memories, or transformable computers. I shall use the common term reconfigurable computer.