P42247 In search of halting mysteries link reply
Due to the halting theorem, it's not possible to write a program that determines if any given computer program halts. If a program does halt, then if we run it for a sufficiently long time, we can see that it halts, although sufficient time may not be available to us. Sometimes a program takes too long to be practical to run to the end, but it's nevertheless possible to prove logically that the program will halt eventually. For some programs that don't halt, we can't find that out by merely running the program, since any given program could just take a very long time to halt, but for many programs we can prove logically that they never halt. There will necessarily be some programs for which nobody knows (yet) whether they halt or not. What do the simplest such programs look like?

To make this concrete, we can pick a representation for programs (I'll choose untyped lambda calculus), a measure of the size of the programs, a way of enumerating over all programs, with the smallest coming first, and a method for evaluating the programs that may or may not halt (normal-order reduction makes sense for this purpose since it succeeds in reducing the term to normal form if any evaluation strategy does).

Rather than reinvent the wheel, or in other words, to be lazy, here's a Stack Overflow answer that enumerates through the closed lambda expressions:
https://stackoverflow.com/questions/21012577/what-is-an-algorithm-to-enumerate-lambda-terms/40581899#40581899
It's based on this paper:
https://arxiv.org/abs/1210.2610
Both the Python script and the paper are attached.
If there's a better-suited enumeration you would suggest, we could consider that too.

The question, then, is: What's the first program in this enumeration of programs for which nobody knows whether it halts or not?
P42269 link reply
Isn't halting from an era when a computer program would access a set data resource, either digital or hard memory, and continue the process until it literally got a halt program to turn off the machine, after which it would continue running endlessly?

Like, I think there are halt commands in existing programs where after a certain set of functions on data it's instructed to halt, which probably just corresponds to a function of releasing an output data segment from use in ram rather than actually stopping any function.

If you're talking about a single function on a block of data and that's the only thing in your registry to do, then you can just multiply the clockcycle time by the number of blocks of input data you have to process.

I don't really know anything about computers though, so maybe there is some big paradox I don't understand.

If the program has to run functions in chronological order on a single core, then couldn't you just multiply the number of functions times the clockcycle time?
P42270 link reply
until it literally got to a halt command*
P42271 link reply
Like, in the punch card computers, there was a piece of paper, cardboard, or wood you'd stick at the end of your data cards to halt the program.

That's the era this comes from and even then it doesn't make any sense to me, because there's literally a halt card at the end and you know how much time it takes.
P42288 link reply
P42269
simple example of a program that does not halt
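In Python terms, a minimal stand-in for such a program would be:

while True:
    pass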
P42289 link reply
whereas this would be a program that does halt
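And a minimal Python stand-in for one that does halt:

print("this runs once and then the program halts")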
P42290 link reply
Another example of a program that does not halt. This program searches for a counterexample to Fermat's Last Theorem, and halts when it finds one. Since Fermat's Last Theorem is true, the program will run forever. Although the theorem is simple to state, it wasn't proven by mathematicians until 1994, unless Fermat's famous posthumously discovered note in the margin of his book really did describe a correct proof.

# enumerate all (n, a, b, c) by increasing total s = n + a + b + c
s = 5
while True:
    for n in range(3, s-2):
        for a in range(1, s-n-1):
            for b in range(1, s-n-a):
                c = s-n-a-b
                lhs_text = '{1}^{0} + {2}^{0}'.format(n,a,b)
                rhs_text = '{1}^{0}'.format(n,c)
                if a**n + b**n == c**n:
                    print(lhs_text + " = " + rhs_text + ", we're done")
                    exit()
                else:
                    print(lhs_text + " != " + rhs_text + ", keep trying")
    s += 1

A caveat is that if you could somehow keep the machine running for unfathomable eons while preventing calculation errors from cosmic rays and other sources, the program would eventually halt anyway, either by exhausting the computer's memory or by hitting the limits of whatever arbitrary-precision integer representation is in use. Strictly speaking, the halting problem (determining if a program halts) is really a mathematical problem about algorithms, in which we imagine that additional memory can be allocated as needed, without limits.
P42294 link reply
Here is an example of a program for which it is not (currently) known whether it halts or not. I don't know how to rewrite it in terms of lambda expressions, and there's a good chance it isn't the simplest such program though.

The program is written as a python function. n is assumed to be a positive integer.

def collatz(n):
    while (n!=1):
        if (n%2==0):
            n = n//2
        else:
            n = n+3

This program computes the terms of Collatz's sequence and halts when it encounters 1. As it is not currently known whether the sequences reach 1 for all positive integers, it is not known whether the program will halt (although it will halt most of the time). The Collatz conjecture hasn't been proven to be unsolvable though.
P42295 link reply
P42247
Modified the Stack Overflow guy's script to make the output a bit more human readable. Now instead of showing De Bruijn indices it uses actual variable names. The variable names are numbers in bijective base-26 (1=a, 26=z, 27=aa, ...) so there can be arbitrarily many. Some output:

0 terms of size 0
1 terms of size 1
(λ a. a)
3 terms of size 2
(λ a. (λ b. b))
(λ a. (λ b. a))
(λ a. (a a))
14 terms of size 3
(λ a. (λ b. (λ c. c)))
(λ a. (λ b. (λ c. b)))
(λ a. (λ b. (λ c. a)))
(λ a. (λ b. (b b)))
(λ a. (λ b. (b a)))
(λ a. (λ b. (a b)))
(λ a. (λ b. (a a)))
(λ a. (a (λ b. b)))
(λ a. (a (λ b. a)))
(λ a. (a (a a)))
(λ a. ((λ b. b) a))
(λ a. ((λ b. a) a))
(λ a. ((a a) a))
((λ a. a) (λ a. a))

No halting mysteries yet. All but three are already normalized, and the remaining three only take one step:
(λ a. ((λ b. b) a)) = (λ a. a)
(λ a. ((λ b. a) a)) = (λ a. a)
((λ a. a) (λ a. a)) = (λ a. a)

An obvious infinite loop term is
((λ a. (a a)) (λ a. (a a))) = ((λ a. (a a)) (λ a. (a a)))
which is size 5. I don't know if it's the first / smallest yet.
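
Going back to the naming scheme: the bijective base-26 conversion is just a few lines (a sketch of my own; the attached script may do it differently):

def var_name(i):
    # 1 -> 'a', 26 -> 'z', 27 -> 'aa', ...
    name = ''
    while i > 0:
        i, r = divmod(i - 1, 26)
        name = chr(ord('a') + r) + name
    return name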
P42299 link reply
P42294
One way to do arithmetic in lambda calculus is Church numerals. Basically you encode the natural number n as a function that takes another function as input and returns a function that repeats the original function n times. For example, 3 would be:
λ f. λ x. f (f (f x))
It's not a very efficient way of doing arithmetic, but it's pretty simple.
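
If it helps to see it run, here's the same encoding transcribed into Python (a sketch; to_int is just my helper for reading a numeral back as an int):

three = lambda f: lambda x: f(f(f(x)))    # λ f. λ x. f (f (f x))
to_int = lambda n: n(lambda k: k + 1)(0)  # apply the numeral to successor and 0
print(to_int(three))                      # 3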

To fit the way I posted the question in OP, we'd also need to apply the Collatz conjecture to a specific number. Apparently it's been tested up to 2^68, so we'd have to pick a number greater than that for it to be a mystery. It would probably be a very short-lived mystery, though, since the numbers I test at that scale reach 1 fairly quickly, usually in a few hundred steps or so.
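
For example, a quick empirical check (the starting value just above 2^68 is arbitrary):

def collatz_steps(n):
    steps = 0
    while n != 1:
        n = n // 2 if n % 2 == 0 else 3 * n + 1
        steps += 1
    return steps

print(collatz_steps(2**68 + 1))  # finishes instantly; step counts at this scale are typically a few hundred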

Also, a correction:
>n = n+3
should be
n = 3*n+1
P42301 link reply
>What's the first program in this enumeration of programs for which nobody knows whether it halts or not?

You can not know if a program that accepts input will ever halt.

You also can not know if a program will halt if the calculation is likely infinite, such as calculating the largest prime that exists. We just don't know if it will halt, but it's likely it will never halt and the numbers will just keep getting larger and further apart from each other.

You can however make every program halt at some point, simply by setting a timeout in the form of computation / clock cycles / time / whatever. So, make it halt if you want it to halt but don't know if it will or will not.
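
For what it's worth, the timeout idea is easy enough to sketch in Python (the one-second limit is arbitrary, and killing the process tells you nothing about whether it would have halted on its own):

import multiprocessing

def suspect():
    # stand-in for a program whose halting status we don't know
    while True:
        pass

if __name__ == '__main__':
    p = multiprocessing.Process(target=suspect)
    p.start()
    p.join(timeout=1.0)   # wait at most one second
    if p.is_alive():
        p.terminate()     # forced halt
        print('timed out')
    else:
        print('halted on its own')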

I don't find this particular problem very interesting because logic dictates that it shouldn't be possible to determine if a particular program ever halts.
It's the "free energy" of computer science.
P42304 link reply
P42299
Oh, you don't want your program to take any argument? That's definitely not the simplest program then. Especially considering the fact that n needs to be pretty damn big for the program to take a significant time to finish. Its time complexity appears to be in O(log(n)) even for "bad" values, and it's faster than that to check whether n verifies the conjecture.
Thanks for the correction. That version obviously converges (a(n+2) < a(n) for a(n) > 3), and has a second cycle at [6, 3].

P42295
This faggot didn't share his modifications so here are mine, although the order is slightly different.
The script tries, in order, all the lambda expressions and displays those that take longer than a certain time to terminate. It remains to be determined whether those eventually terminate or not. I use lci (https://www.chatzi.org/lci/) to evaluate the lambda expressions.
Incidentally, ((λ a. (a a)) (λ a. (a a))) is indeed the first that doesn't terminate. I don't really understand lambda calculus. Is it equivalent to
>def f():
> f()

?

Here are the expressions of size <=6 that don't terminate in less than one second:
0 terms of size 0
1 terms of size 1
3 terms of size 2
14 terms of size 3
82 terms of size 4
579 terms of size 5
((λ a. (a a)) (λ a. (a a)))
4741 terms of size 6
(λ a. ((λ b. (b b)) (λ b. (b b))))
((λ a. (a a)) (λ a. (λ b. (a a))))
((λ a. (a a)) (λ a. (a (a a))))
((λ a. (a a)) (λ a. ((a a) a)))
((λ a. (λ b. (a a))) (λ a. (a a)))
((λ a. (a (a a))) (λ a. (a a)))
((λ a. ((a a) a)) (λ a. (a a)))
P42305 link reply
P42301
>You can not know if a program that accepts input will ever halt.
What?
I'm pretty sure that
<def f(a,b):
< return a+b
does halt, although it may take an arbitrarily long time, and require an arbitrarily large amount of memory depending on the values of a and b. In any case OP's problem is about programs that don't take any input.

>You also can not know if a program will halt if the calculation is likely infinite
That's the point, I think. OP is wondering what the simplest such program is.

>I don't find this particular problem very interesting because logic dictates that it shouldn't be possible to determine if a particular program ever halts.
Logic dictates that there exist programs for which we can't determine whether they halt. That's not the same thing. There are tons of programs for which it's possible (and programmers usually try to write programs that terminate [spoiler: except for interactive programs])
<print("Hello world")
halts
<while True:
< print("Hello world")
doesn't

PS. Even in the case of interactive programs, given the series of user inputs, it should be possible to determine whether the program halts or not (and if it isn't, the programmer should kill themselves)
P42306 link reply
P42305
>>You can not know if a program that accepts input will ever halt.
>What?


>user keeps entering calculations into calculator program
>never exits the program

How do you determine this?
You don't and you can't.

>There are tons of programs for which it's possible
It's only possible by limiting the set of things you can do with a particular programming language (see Idris or Haskell or Coq) OR by limiting what you do within that program, but that's not what I'm talking about, I'm talking about arbitrary programs.

Try to write a program (also called "proof") that determines if another arbitrary program terminates or not WITHOUT executing it. It's not possible.
P42307 link reply
P42305
>OP is wondering what the simplest such program is.
Hmm I saw discussion on usenet about this a long time ago. I will check if I can find it because I remember someone posted something like this.
P42309 link reply
count.py
P42295
meant to attach the file, my bad
P42312 link reply
P42304
>Oh, you don't want your program to take any argument?
We could consider the case with input too, might be interesting, but it could be more complex. You wouldn't, for example, be able to run through the list and check if they terminate in less than one second because you have to test every input.

>I don't really understand lambda calculus. Is it equivalent to
>>def f():
>> f()
>?

To be more precise, it's equivalent to
def f(g):
    return g(g)
f(f)
P42371 link reply
Another program whose halting status is unknown would be a search for a counterexample to Goldbach's conjecture. No inputs needed for this one.

#!/usr/bin/env python

def is_prime(p):
    i = 2
    while i * i <= p:
        if p % i == 0:
            return False
        i += 1
    return True

n = 4
while True:
    for p in range(2, n//2+1):
        if is_prime(p) and is_prime(n-p):
            print('{} = {} + {}'.format(n, p, n-p))
            break
    else:
        # for/else: runs only if no decomposition was found (the loop never hit break)
        print('counterexample: {}'.format(n))
        break
    n += 2
P42372 link reply
P42304
It's pretty simple to express a very large number in lambda calculus. For example
(λ t. ((t t) t) t) (λ f. λ x. f (f (f x)))
is
3^(3^(3^3)).
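A quick way to convince yourself of the exponentiation-by-application trick is to run the same shape of term in Python with 2 in place of 3, so the result stays small (to_int is just a helper to read the numeral back):

two = lambda f: lambda x: f(f(x))         # Church numeral 2
to_int = lambda n: n(lambda k: k + 1)(0)
tower = ((two(two))(two))(two)            # same shape as ((t t) t) t
print(to_int(tower))                      # 65536 == 2**(2**(2**2))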
P42376 link reply
timeout7.txt
P42304
>although the order is slightly different
It looks like you didn't take into account that the numbers were supposed to be De Bruijn indices instead of indexing the variables directly. But I don't think that changes the set of expressions that get iterated over, only their order.

>((λ a. (a a)) (λ a. (a a)))
>Is it equivalent to
>>def f():
>> f()
>?

I should add that it's kind of used as a substitute for that since lambda calculus doesn't support writing recursive functions directly. But you can write
(λ x. f (x x)) (λ x. f (x x))
which behaves as follows:
(λ x. f (x x)) (λ x. f (x x))
= f ((λ x. f (x x)) (λ x. f (x x)))
= f (f ((λ x. f (x x)) (λ x. f (x x))))
= f (f (f ((λ x. f (x x)) (λ x. f (x x)))))
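
The closest runnable analogue in Python would be something like this, eta-expanded so it works under eager evaluation (factorial is just an arbitrary example of an f to plug in):

def fix(f):
    # (λ x. f (x x)) (λ x. f (x x)), with x x delayed behind a lambda
    return (lambda x: f(lambda v: x(x)(v)))(lambda x: f(lambda v: x(x)(v)))

fact = fix(lambda rec: lambda n: 1 if n == 0 else n * rec(n - 1))
print(fact(5))  # 120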

>Here are the expressions of size <=6 that don't terminate in less than one second:
These ones are indeed all infinite loops. Computing them by hand:

λ a. [bold: (λ b. b b) (λ b. b b)]
= λ a. [bold: (λ b. b b) (λ b. b b)]

(λ a. a a) (λ a. λ b. a a)
= [bold: (λ a. λ b. a a) (λ a. λ b. a a)]
= λ c. [bold: (λ a. λ b. a a) (λ a. λ b. a a)]

(λ a. a a) (λ a. a (a a))
= [bold: (λ a. a (a a)) (λ a. a (a a))]
= (λ a. a (a a)) ((λ a. a (a a)) (λ a. a (a a)))
= ([bold: (λ a. a (a a)) (λ a. a (a a))]) (((λ a. a (a a)) (λ a. a (a a))) ((λ a. a (a a)) (λ a. a (a a))))

(λ a. a a) (λ a. (a a) a)
= [bold: (λ a. (a a) a) (λ a. (a a) a)]
= ([bold: (λ a. (a a) a) (λ a. (a a) a)]) (λ a. (a a) a)

(λ a. λ b. a a) (λ a. a a)
= λ b. [bold: (λ a. a a) (λ a. a a)]
= λ b. [bold: (λ a. a a) (λ a. a a)]

(λ a. a (a a)) (λ a. a a)
= (λ a. a a) ((λ a. a a) (λ a. a a))
= ([bold: (λ a. a a) (λ a. a a)]) ((λ a. a a) (λ a. a a))
= ([bold: (λ a. a a) (λ a. a a)]) ((λ a. a a) (λ a. a a))

(λ a. (a a) a) (λ a. a a)
= ([bold: (λ a. a a) (λ a. a a)]) (λ a. a a)
= ([bold: (λ a. a a) (λ a. a a)]) (λ a. a a)

I ran your script and file attached is what I got for the size 7 terms that timed out. I haven't investigated them much yet.
P42384 link reply
P42307
Ah it was a discussion about the halting problem in general but not about the simplest program. Anyways I'll post it. Seemed interesting but probably it's just schizo ramblings.

>Gödel's_proof_relies_on_self-contradictory_expressions_of_language
https://novabbs.org/computers/article-flat.php?id=47153&group=comp.theory

>Simulating (partial) Halt Deciders [updated paper]
https://novabbs.org/computers/article-flat.php?id=47217&group=comp.theory

>Simulating (partial) Halt Deciders Defeat the Halting Problem Proofs
https://novabbs.org/computers/article-flat.php?id=46401&group=comp.theory
P42632 link reply
Of the size-7 terms known to halt, this one takes the longest at 11 steps:
(λ a. a a) (λ a. a (a (λ b. b)))

Abbreviating λ b. b as I:
(λ a. a a) (λ a. a (a I))
= (λ a. a (a I)) (λ a. a (a I))
= (λ a. a (a I)) ((λ a. a (a I)) I)
= ((λ a. a (a I)) I) (((λ a. a (a I)) I) I)
= (I (I I)) (((λ a. a (a I)) I) I)
= (I I) (((λ a. a (a I)) I) I)
= I (((λ a. a (a I)) I) I)
= ((λ a. a (a I)) I) I
= (I (I I)) I
= (I I) I
= I I
= I

The reduction goes a bit faster if not done in normal order:
(λ a. a a) (λ a. a (a I))
= (λ a. a (a I)) (λ a. a (a I))
= (λ a. a (a I)) ((λ a. a (a I)) I)
= (λ a. a (a I)) (I (I I))
= (λ a. a (a I)) (I I)
= (λ a. a (a I)) I
= I (I I)
= I I
= I

I haven't yet confirmed non-termination for the 81 size-7 terms not known to terminate, but they all take longer than 100 steps to get there.
P42638 link reply
count.py
count.txt
Current code and the results for up to size 7 attached. I've added an evaluator to the Python script, and the results show the terms that don't reduce in a timely manner, and what happens to them in the first 101 reduction steps. I've run it up to size 9, but the output is too large to attach. Here are the counts:

0 terms of size 0
0 terms of size 0 exceeded size 10000 in 100 steps
0 terms of size 0 not reduced in 100 steps
1 terms of size 1
0 terms of size 1 exceeded size 10000 in 100 steps
0 terms of size 1 not reduced in 100 steps
3 terms of size 2
0 terms of size 2 exceeded size 10000 in 100 steps
0 terms of size 2 not reduced in 100 steps
14 terms of size 3
0 terms of size 3 exceeded size 10000 in 100 steps
0 terms of size 3 not reduced in 100 steps
82 terms of size 4
0 terms of size 4 exceeded size 10000 in 100 steps
0 terms of size 4 not reduced in 100 steps
579 terms of size 5
0 terms of size 5 exceeded size 10000 in 100 steps
1 terms of size 5 not reduced in 100 steps
4741 terms of size 6
0 terms of size 6 exceeded size 10000 in 100 steps
7 terms of size 6 not reduced in 100 steps
43977 terms of size 7
0 terms of size 7 exceeded size 10000 in 100 steps
81 terms of size 7 not reduced in 100 steps
454283 terms of size 8
1 terms of size 8 exceeded size 10000 in 100 steps
747 terms of size 8 not reduced in 100 steps
5159441 terms of size 9
39 terms of size 9 exceeded size 10000 in 100 steps
7943 terms of size 9 not reduced in 100 steps

For size 10 I run into memory issues and will have to adjust the generating code.
P42673 link reply
How do I even get smart enough to understand this? Sure, I've used lambdas before, but lambda calculus still seems so weird. Any suggestions?
P42684 link reply
P42673
You might be as smart as the rest of us already. The language, especially without types to hint at what's going on, is a Turing tarpit. It's not like I can look at these things and just intuit what they'll do; I need to work out the steps.
P42715 link reply
count.py
count.txt
I found a bug in the order my script was evaluating the lambda terms. Wikipedia explains the difference between what I was doing and normal order evaluation:

https://en.wikipedia.org/wiki/Reduction_strategy#Lambda_calculus
>Leftmost reduction is sometimes used to refer to normal order reduction, as with a pre-order traversal the notions coincide, and similarly the leftmost-outermost redex is the redex with leftmost starting character when the lambda term is considered as a string of characters.[13][14] When "leftmost" is defined using an in-order traversal the notions are distinct. For example, in the term (λ x. x Ω) (λ y. I) with Ω, I defined here [https://en.wikipedia.org/wiki/Lambda_calculus#Standard_terms], the leftmost redex of the in-order traversal is Ω while the leftmost-outermost redex is the entire expression.[15]

In the above, Ω is (λ a. a a) (λ a. a a), the smallest lambda term that goes into an infinite loop rather than eventually reducing to a normal form, and I is the identity function λ a. a.

Up to size 8, the lambda terms that I was not able to confirm as normalizing (that is, reaching normal form and coming to a halt) remain the same. There's an increase in the number of terms that explode in size rather than just hanging. There are three size-9 terms that reach normal form under the intended normal-order evaluation but not the old bugged one:
((λ a. (a a)) (λ a. (λ b. (b ((a a) (λ c. b))))))
((λ a. (a a)) (λ a. (λ b. (b ((a a) (λ c. a))))))
((λ a. ((a a) (λ b. a))) (λ a. (λ b. (b (a a)))))
The term given in the Wikipedia example, which also reaches normal form in normal-order evaluation but not in what I was doing previously, is of size 10.

I've also started looking for ways to identify provably hung programs. The newest version of the script identifies any term where the reduction of the entire term goes in a loop. Of the size-7 terms, this is 53 out of 81 of the programs not known to halt by reaching normal form. The remaining 28 terms are:

(λ a. ((λ b. (b b)) (λ b. (λ c. (b b)))))
(λ a. ((λ b. (b b)) (λ b. (b (b b)))))
(λ a. ((λ b. (b b)) (λ b. (a (b b)))))
(λ a. ((λ b. (b b)) (λ b. ((b b) b))))
(λ a. ((λ b. (b b)) (λ b. ((b b) a))))
((λ a. (a a)) (λ a. (λ b. (λ c. (a a)))))
((λ a. (a a)) (λ a. (λ b. (b (a a)))))
((λ a. (a a)) (λ a. (λ b. (a (a a)))))
((λ a. (a a)) (λ a. (a (λ b. (a b)))))
((λ a. (a a)) (λ a. (a (a (λ b. a)))))
((λ a. (a a)) (λ a. (a (a (a a)))))
((λ a. (a a)) (λ a. (a ((λ b. b) a))))
((λ a. (a a)) (λ a. (a ((λ b. a) a))))
((λ a. (a a)) (λ a. (a ((a a) a))))
((λ a. (a a)) (λ a. ((a a) (λ b. b))))
((λ a. (a a)) (λ a. ((a a) (λ b. a))))
((λ a. (a a)) (λ a. ((a a) (a a))))
((λ a. (a a)) (λ a. ((a (a a)) a)))
((λ a. (a a)) (λ a. (((a a) a) a)))
((λ a. (λ b. (a a))) (λ a. (λ b. (a a))))
((λ a. (λ b. (a a))) (λ a. (a (a a))))
((λ a. (λ b. (a a))) (λ a. ((a a) a)))
((λ a. (a (a a))) (λ a. (λ b. (a a))))
((λ a. (a (a a))) (λ a. (a (a a))))
((λ a. (a (a a))) (λ a. ((a a) a)))
((λ a. ((a a) a)) (λ a. (λ b. (a a))))
((λ a. ((a a) a)) (λ a. (a (a a))))
((λ a. ((a a) a)) (λ a. ((a a) a)))
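
For reference, the whole-term loop check amounts to something like this (a sketch, not the actual code in count.py; reduce_once is a hypothetical helper that performs one leftmost-outermost step and returns None at normal form, and terms are assumed hashable):

def classify(term, max_steps=100):
    seen = {term}
    for _ in range(max_steps):
        term = reduce_once(term)   # hypothetical one-step normal-order reducer
        if term is None:
            return 'normal form'   # halted
        if term in seen:
            return 'known loop'    # exact repeat, so it provably reduces forever
        seen.add(term)
    return 'unknown'               # gave up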

I'll be looking for slightly more sophisticated ways to identify infinite loops, such as the cases where the reduced term has the original as a subterm. But I'm going to need to be careful about it. Here's an example of a term that reduces to something including itself as a subterm before reducing to normal form:

(λ a. (λ b. λ c. c) (a a)) (λ a. (λ b. λ c. c) (a a))
= (λ b. λ c. c) ((λ a. (λ b. λ c. c) (a a)) (λ a. (λ b. λ c. c) (a a)))
= λ c. c

Here are the new statistics up to size 9. I've attached the output showing reductions of the terms up to size 7.

0 terms of size 0

1 terms of size 1

3 terms of size 2

14 terms of size 3

82 terms of size 4

579 terms of size 5
steps exceeded: 1
known loop: 1

4741 terms of size 6
steps exceeded: 7
known loop: 4
unknown: 3

43977 terms of size 7
steps exceeded: 80
known loop: 53
unknown: 28
size exceeded: 1

454283 terms of size 8
steps exceeded: 735
known loop: 369
unknown: 379
size exceeded: 13

5159441 terms of size 9
steps exceeded: 7756
known loop: 3799
unknown: 4180
size exceeded: 223
P42768 link reply
P42673
To me the weirdest thing about the lambda calculus is that while normally a programming language would have both data and functions, lambda calculus only has functions. So there have to be hacks to simulate data with functions. The most common seems to be to pick some function that would normally consume that data type (an "eliminator" function), and emulate that data type with functions that would behave the same as the eliminator function called with the data as an argument.

For example, with the boolean datatype, there are normally two values, true and false, and there's an eliminator function ifthenelse that behaves as follows:

ifthenelse true A B = A
ifthenelse false A B = B

What we do in the lambda calculus is define true as what we would call "ifthenelse true" in a language with separate data and functions.

true A B = A
false A B = B

Or written as lambda terms:

true = λ a. λ b. a
false = λ a. λ b. b
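
Transcribed into Python just to see it run (the names are mine):

true = lambda a: lambda b: a        # λ a. λ b. a
false = lambda a: lambda b: b       # λ a. λ b. b
ifthenelse = lambda p: lambda a: lambda b: p(a)(b)
print(ifthenelse(true)('A')('B'))   # A
print(ifthenelse(false)('A')('B'))  # B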
P42769 link reply
P42673
fr. The last time I've used lambda was in ancient greek class.
P42771 link reply
To honor this site's name, fagmin should rewrite this site in lambda calculus and have it be interpreted by a javascript engine.
P42855 link reply
P42715
>I'll be looking for slightly more sophisticated ways to identify infinite loops, such as the cases where the reduced term has the original as a subterm. But I'm going to need to be careful about it.

I found another term that looks like it ought to loop but doesn't.

(λ a. a a) (λ a. λ b. b ((a a) (λ c. b)))

Letting F = λ a. λ b. b ((a a) (λ c. b)), it reduces as follows:
(λ a. a a) F
= F F
= λ b. b ((F F) (λ c. b)) <- notice how F F appears within its own reduct and is the next redex to reduce
= λ b. b ((λ d. d ((F F) (λ c. b))) (λ c. b))
= λ b. b ((λ c. b) ((F F) (λ c. b)))
= λ b. b b

There's also a second term like this of the same size (both are size 9):
(λ a. a a) (λ a. λ b. b ((a a) (λ c. a)))
Letting G = λ a. λ b. b ((a a) (λ c. a)),
(λ a. a a) G
= G G
= λ b. b ((G G) (λ c. G))
= λ b. b ((λ d. d ((G G) (λ c. G))) (λ c. G))
= λ b. b ((λ c. G) ((G G) (λ c. G)))
= λ b. b G
P42892 link reply
count.py
count.txt
P42855
Alright, I think I've accounted for this now. The issue is that when the process is repeated, the F F found within the reduction of the previous F F need not be the leftmost-outermost redex since the previous F F expands to something starting with a λ and it's being applied to something.

Latest code and results attached. The summary:

0 terms of size 0

1 terms of size 1

3 terms of size 2

14 terms of size 3

82 terms of size 4

579 terms of size 5
steps exceeded: 1
known loop: 1

4741 terms of size 6
steps exceeded: 7
known loop: 7

43977 terms of size 7
steps exceeded: 80
known loop: 76
unknown: 5
size exceeded: 1

454283 terms of size 8
steps exceeded: 735
known loop: 693
unknown: 55
size exceeded: 13

5159441 terms of size 9
steps exceeded: 7756
known loop: 7102
unknown: 877
size exceeded: 223

The terms of size 7 for which the script was not able to figure out whether reduction halts are:
((λ a. (a a)) (λ a. (λ b. (b (a a)))))
((λ a. (a a)) (λ a. (a (λ b. (a b)))))
((λ a. (a a)) (λ a. (a (a (λ b. a)))))
((λ a. (a a)) (λ a. (a ((λ b. b) a))))
((λ a. (a a)) (λ a. (a ((λ b. a) a))))
P42896 link reply
count.py
count.txt
Okay, looking at how this guy reduces:
>(λ a. a a) (λ a. λ b. b (a a))

[Let F = λ a. λ b. b (a a)]
(λ a. a a) F
= F F
= λ b. b (F F)
my fix for the issue in P42855 could be a little narrower. This one can't have the problem since the second F F is the argument.

New code and output attached. New summary:

0 terms of size 0

1 terms of size 1

3 terms of size 2

14 terms of size 3

82 terms of size 4

579 terms of size 5
steps exceeded: 1
known loop: 1

4741 terms of size 6
steps exceeded: 7
known loop: 7

43977 terms of size 7
steps exceeded: 80
known loop: 77
unknown: 4
size exceeded: 1

454283 terms of size 8
steps exceeded: 735
known loop: 702
unknown: 46
size exceeded: 13

5159441 terms of size 9
steps exceeded: 7756
known loop: 7241
unknown: 738
size exceeded: 223

The terms of size 7 for which the script was not able to figure out whether reduction halts are now:
((λ a. (a a)) (λ a. (a (λ b. (a b)))))
((λ a. (a a)) (λ a. (a (a (λ b. a)))))
((λ a. (a a)) (λ a. (a ((λ b. b) a))))
((λ a. (a a)) (λ a. (a ((λ b. a) a))))
P42935 link reply
P42896
Let's look at the remaining terms of size 7.

>(λ a. a a) (λ a. a ((λ b. b) a))
>(λ a. a a) (λ a. a ((λ b. a) a))

By choosing a different beta-reduction than what normal-order evaluation would tell us to do, these can both be reduced to
(λ a. a a) (λ a. a a)
which we already know doesn't have a normal form. Since beta-reduction is confluent,
https://en.wikipedia.org/wiki/Church%E2%80%93Rosser_theorem
that means these don't have normal forms either.

>(λ a. a a) (λ a. a (λ b. a b))
This one is clearly an eta-expansion of
(λ a. a a) (λ a. a a).
I would expect that to mean this one doesn't have a (beta) normal form either. I had to do some reading for this one. I didn't find a theorem explicitly saying that eta-conversion preserves the property of having or not having a beta normal form, but it follows trivially from two theorems I did find: A term has a beta-eta normal form if and only if it has a beta normal form, and the combination of beta and eta reduction is confluent.

The remaining term is more interesting looking.

(λ a. a a) (λ a. a (a (λ b. a)))

Let's see how it behaves when we reduce it using the usual normal-order evaluation.
Let
F = λ a. a (a (λ b. a)).

We have
(λ a. a a) F
= F F
= F (F (λ b. F)).

Let
F2 = F (λ b. F)
and note that it reduces to
F2
= (λ b. F) ((λ b. F) (λ b. λ c. F))
= F.

So now the reduction of our term looks like this:
(λ a. a a) F
= F F
= F F2
= F2 (F2 (λ b. F2))
= F (F2 (λ b. F2))

Let
F3 = F2 (λ b. F2).
It also reduces to
F3
= F (λ b. F2)
= (λ b. F2) ((λ b. F2) (λ b. λ c. F2))
= F2
= F.

Now we've got this:
(λ a. a a) F
= F F
= F F2
= F2 F3
= F F3
= F3 (F3 (λ b. F3))
= F (F3 (λ b. F3))

We can see where this is going now. Let
F_1 = F
F_{n+1} = F_n (λ b. F_n)

We can prove by induction that every F_n reduces to F.
Base case: F_1 is already F.
Assume as an inductive hypothesis: F_n reduces to F. Then
F_{n+1}
= F_n (λ b. F_n)
= F (λ b. F_n)
= (λ b. F_n) ((λ b. F_n) (λ b. λ c. F_n))
= F_n
= F.

We'll also need to note that
F F_n
= F_n (F_n (λ b. F_n))
= F_n F_{n+1}.

We can now see that the reduction will go like this:
(λ a. a a) F
= F F
= F F2
= F2 F3
= F F3
= F3 F4
= F F4
= F4 F5
...

So this one is also an infinite loop.

There's probably a simpler way to see this. I'm going to have to read a bit more about what makes normal-order evaluation guaranteed to find the normal form if it exists, and what variants I can use that are likewise guaranteed to find the normal form.
P43006 link reply
P42935
>There's probably a simpler way to see this.
I found it. Apparently there's what's known as the quasi-leftmost reduction theorem, which says that if a term has a normal form, there is no infinite sequence of reduction steps of which infinitely many reduce the leftmost-outermost redex.

Again letting
F = λ a. a (a (λ b. a)),
we can now proceed as follows:

(λ a. a a) F
= F F
= F (F (λ b. F))
= F ((λ b. F) ((λ b. F) (λ c. (λ b. F))))
= F F
...

Since the reduction
F F
= F (F (λ b. F))
reduces the leftmost-outermost redex (in this case F F), and this happens infinitely many times, this is incompatible with the term having a normal form.
P43714 link reply
Let's try from the other end. How small of a lambda expression can we write to, for example, check if a number's Collatz sequence reaches 1?

Wikipedia lists a function to divide Church numerals, but it's really long, so I'll use my own. Basically the idea is to start with the pair
>(false, x)
and repeat the following function n times:
>(a, b) -> (not a, if a then f b else b)
which will result in the pair
>(is_odd(n), n/2).
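
Before encoding it, here's the same pair trick in plain Python, just to check it really computes (is_odd(n), n/2); f plays the role of successor and x is 0 here, so the second component can be read off directly:

def halve(n):
    a, b = False, 0                           # the pair (false, x)
    for _ in range(n):
        a, b = (not a), (b + 1 if a else b)   # (a, b) -> (not a, if a then f b else b)
    return a, b

for n in range(8):
    print(n, halve(n))   # (is_odd(n), n // 2)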

We can encode a pair as a function that operates on another function by passing it two values. In other words,
>pair a b
can be encoded as
>λ g. g a b.

When written out, this looks like:
>n (λ g. g (λ a. λ b. λ h. h (λ u. λ v. a v u) (a (f b) b))) (λ g. g (λ u. λ v. v) x)

Then we just need to take the result and do
>(a, b) -> if a then 3 (n f) (f x) else b
and of course take f and x as inputs to make the result into a Church numeral.

Our function for computing the next step in the Collatz sequence is then:
>λ n. λ f. λ x. n (λ g. g (λ a. λ b. λ h. h (λ u. λ v. a v u) (a (f b) b))) (λ g. g (λ u. λ v. v) x) (λ a. λ b. a (3 (n f) (f x)) b)
It makes sense to define 3 elsewhere since we'll also be using it to form the number we're going to use as input.

We can make this a bit shorter. Instead of operating on the pairs, start with the function
>λ a. λ b. a (3 (n f) (f x)) b
that will make use of the two parts of the pair in the end. Then repeatedly operate on the function, changing g to
>λ a. λ b. g (λ u. λ v. a v u) (a (f b) b).
After repeating this n times, we take the final function and pass it the initial values of the two parts of the pair,
>λ u. λ v. v
and
>x.

Putting it all together, we obtain a shorter function for computing the next step
>λ n. λ f. λ x. n (λ g. λ a. λ b. g (λ u. λ v. a v u) (a (f b) b)) (λ a. λ b. a (3 (n f) (f x)) b) (λ u. λ v. v) x
which we can further eta-reduce to:
>λ n. λ f. λ x. n (λ g. λ a. λ b. g (λ u. λ v. a v u) (a (f b) b)) (λ a. a (3 (n f) (f x))) (λ u. λ v. v) x

Now we just need to use the Y trick to implement the recursion
>collatz_loop(n) = if n <= 1 then n else collatz_loop(collatz_step(n)).

But comparing n with 1 will be a bit longer than we'd like. It's simpler to do a comparison with 0. So let's modify the sequence so everything is shifted down by 1.
>n even -> n/2
>n odd -> 3n+1

becomes
>n odd -> floor(n/2)
>n even -> 3n+3.
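
A quick plain-Python check that the shifted rules track the original ones (m stands for n - 1):

def original_step(n):
    return n // 2 if n % 2 == 0 else 3 * n + 1

def shifted_step(m):
    return m // 2 if m % 2 == 1 else 3 * m + 3

for n in range(2, 50):
    assert shifted_step(n - 1) == original_step(n) - 1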


This changes the stepping function to:
>λ n. λ f. λ x. n (λ g. λ a. λ b. g (λ u. λ v. a v u) (a (f b) b)) (λ a. λ b. a b (3 (n f) ((3 f) x))) (λ u. λ v. v) x

We can get it a bit shorter by making the division part compute (is_even(n), n/2), because the eta reduction works again:
>λ n. λ f. λ x. n (λ g. λ a. λ b. g (λ u. λ v. a v u) (a b (f b))) (λ a. λ b. a (3 (n f) ((3 f) x)) b) (λ u. λ v. u) x
>λ n. λ f. λ x. n (λ g. λ a. λ b. g (λ u. λ v. a v u) (a b (f b))) (λ a. a (3 (n f) ((3 f) x))) (λ u. λ v. u) x


Now to get the recursion working. We want to define a function R such that
>R R n
reduces to
>if n == 0 then n else R R (collatz_step n)
so we define R like this:
>λ r. λ n. if n == 0 then n else r r (collatz_step n)
With the zero test written out, this becomes:
>λ r. λ n. n (λ m. r r (collatz_step n)) n
And now plugging in our stepping function:
>λ r. λ n. n (λ m. r r (λ f. λ x. n (λ g. λ a. λ b. g (λ u. λ v. a v u) (a (f b) b)) (λ a. λ b. a b (3 (n f) ((3 f) x))) (λ u. λ v. v) x)) n

To get the loop function we want, we need to apply R to itself:
>(λ z. z z) (λ r. λ n. n (λ m. r r (λ f. λ x. n (λ g. λ a. λ b. g (λ u. λ v. a v u) (a (f b) b)) (λ a. λ b. a b (3 (n f) ((3 f) x))) (λ u. λ v. v) x)) n)

Now all we need is an enormous initial argument so we don't know if it halts. Let's try
>3^(3^(3^3))
which given our offset of 1 means the number we're really testing as a starting point is
>3^(3^(3^3)) + 1.
We can write 3^(3^(3^3)) in lambda calculus as
>((3 3) 3) 3
where 3 is given by
λ f. λ x. f (f (f x)).

Putting it all together, we have:
>(λ t. (λ z. z z) (λ r. λ n. n (λ m. r r (λ f. λ x. n (λ g. λ a. λ b. g (λ u. λ v. a v u) (a (f b) b)) (λ a. λ b. a b (t (n f) ((t f) x))) (λ u. λ v. v) x)) n) (((t t) t) t)) (λ f. λ x. f (f (f x)))

This is a term of size 50, and I don't know whether its reduction process halts or not, and it's possible nobody currently does, unless someone has a clever argument for why the Collatz sequence starting with 3^(3^(3^3)) + 1 should reach 1.
P52827 link reply
On the Turing side of Church-Turing, there's something called the Busy Beaver function, which asks for the maximum number of 1s an n-state Turing machine can leave on its tape, provided the machine eventually halts. Attempts to determine the values of this function have led people to check the halting status of Turing machines with increasing numbers of states, and at 5 states there are Turing machines whose halting status remains a mystery:

https://en.wikipedia.org/wiki/Busy_beaver
>The current (as of 2023) 5-state busy beaver champion produces 4098 1s, using 47176870 steps (discovered by Heiner Marxen and Jürgen Buntrock in 1989), but there remain many machines with non-regular behavior which are believed to never halt, but which have not been proven to run infinitely. Various sources list different numbers of these holdouts. Skelet lists 42 or 43 unproven machines.
https://skelet.ludost.net/bb/nreg.html

There's been some investigation into Busy Beaver equivalents for lambda calculus:
https://oeis.org/A333479
P52833 link reply
is halting of a program the same as exit
P52834 link reply
P52833
Yes.
P53011 link reply
You're saying the program that can check if another program exited or can exit or stopped can't be written?
P53019 link reply
P53011
Suppose we have a subroutine halts such that

halts(program, {arg1, arg2, ..., argn}) = true if program(arg1, arg2, ... argn) halts
halts(program, {arg1, arg2, ..., argn}) = false if program(arg1, arg2, ... argn) continues running forever

Then we can write the following subroutine:

function f(prog)
    if halts(prog, {prog}) then
        while true do
        end
    else
        return
    end
end

Now consider calling f(f).
If halts(f, {f}) returns true, then f(f) goes into the infinite while loop. But the specification of halts says that halts(f, {f}) should return false in this situation.
If halts(f, {f}) returns false, then f(f) halts. But the specification of halts says that halts(f, {f}) should return true in this situation.

Thus there is no way to write an accurate version of the halts subroutine.

Knowing what we can't do, let's talk about what we can do. We can write a program that returns accurate halting results for some programs. Most simply, for programs that halt, we have our halting check program simply run the program in question until it halts. But for programs that don't halt, we may never know for sure that they run forever (on an idealized computer as described in P42290; obviously real computers break down at some point). For some programs, it's obvious that they're stuck in a loop, but for others, you can't tell whether the program is really going to run forever or whether it will terminate but just hasn't yet.
P54866 link reply
The halting theorem states that there is no algorithm that can accurately determine whether a given program halts or not. Therefore, is there something other than an algorithm that could determine that? More generally, is there a function that cannot be computed by an algorithm that can be computed some other way?
[spoiler:What is an algorithm? Am I an algorithm?]
P55016 link reply
P54866
An algorithm is something that can be run on a Turing machine, or equivalently, can be programmed in lambda calculus. So far nobody knows of any real thing that can compute things that Turing machines can't.
>Am I an algorithm?
maybe