Chinaunix

标题: 【转】用正则匹配素数的写法 [打印本页]

作者: x9x9 时间: 2011-12-27 11:43
标题: 【转】用正则匹配素数的写法
在微博里看到的，感觉思路很有趣，转过来：

perl -lne '(1x$_) =~ /^1?$|^(11+?)\1+$/ || print "$_ is prime"'

复制代码

原文链接：http://www.catonmat.net/blog/per ... ches-prime-numbers/

作者: zhlong8 时间: 2011-12-27 11:53
前半部分不说了，后半部分是 a x b 的意思吧， (11+?) 是 a \1+ 重复的次数是 b - 1 加上前面那次就是 a x b。今天才看懂啊

作者: zhlong8 时间: 2011-12-27 12:00
本帖最后由 zhlong8 于 2011-12-27 12:00 编辑

11+? 从2开始试，\1+ 加号保证 b >= 2

作者: x9x9 时间: 2011-12-27 12:15
回复 3# zhlong8

其实用到的正则方面的知识并不多。但是思路很奈寻味。

让人想起胡适先生的一首《梦与诗》，呵呵：

都是平常经验
都是平常影象
偶然涌到梦中来
变幻出多少新奇花样

都是平常情感
都是平常言语
偶然碰着个诗人
变幻出多少新奇诗句

醉过才知酒浓
爱过才知情重 ——
你不能做我的诗
正如我不能做你的梦

作者: seufy88 时间: 2011-12-27 13:35
回复 4# x9x9

好诗好诗

作者: sellie 时间: 2011-12-27 16:37
看懂的人能不能详细解释一下,用通俗简单易懂一些的方式. 帮助一下我这样的有点迟钝的人. 谢谢

作者: 3P主义 时间: 2011-12-27 17:11

my $string = "-" x 1000;
$string =~ s/(?!(-{2,})\1{1,}$)(?=(-{2,}))/print length($2), "\n"/eg;

复制代码

1000 以内素数.

作者: zhlong8 时间: 2011-12-27 18:56

sellie 发表于 2011-12-27 16:37
看懂的人能不能详细解释一下,用通俗简单易懂一些的方式. 帮助一下我这样的有点迟钝的人. 谢谢

我下面跟的那两贴就是给看不懂的人解释的啊，你要不懂回溯那就要补基础了。总之就是素数不能拆成两个大于2的数的乘积

作者: kk861123 时间: 2011-12-27 19:07
回复 4# x9x9

好诗，耐人寻味

作者: 羲之遗韵 时间: 2011-12-28 09:16
好诗好代码！

作者: 3P主义 时间: 2011-12-28 09:30

鬼佬的写法没有哥的好。

作者: minirain 时间: 2015-12-08 20:45

Here is how it works.

First, the number is converted in its unary representation by (1x$_). For example, the number 5 gets converted into 1x5, which is 11111 (1 repeated 5 times.)

Next, the unary string gets tested against the regular expression. If it matches, the number is a composite, otherwise it's a prime.

The regular expression works this way. It consists of two parts ^1?$ and ^(11+?)\1+$.

The first part matches number 1 and the empty string. Clearly, the empty string and number 1 are not prime numbers, therefore this regular expression matches, which indicates that they are not prime numbers.

The second part determines if two or more 1s repeatedly make up the whole number. If two or more 1s repeatedly make up the whole number, the regex matches, which means that the number is composite. Otherwise it's a prime.

Let's look at the second regex part on numbers 5 and 4.

The number 5 in unary representation is 11111. The (11+?) matches the first two ones 11. The back-reference \1 becomes 11 and the whole regex now becomes ^11(11)+$. It can't match five ones, therefore it fails. But since it used +?, it backtracks and matches the first three ones 111. The back-reference becomes 111 and the whole regex becomes ^111(111)+$. It doesn't match again. This repeats for 1111 and 11111, which also don't match, therefore the whole regex doesn't match and the number is a prime.

The number 4 in unary representation is 1111. The (11+?) matches the first two ones 11. The back-reference \1 becomes 11 and the regex becomes ^11(11)+$. It matches the original string, therefore the number is not a prime.

PS. I didn't invent this regular expression, it was invented in 1998 by Abigail.

Don't take this regular expression too seriously, it's actually neither a regular expression (as defined in automata theory), nor a way to check if a number is a prime. It's just an awesome thing that Perl can do. See this cool article called The Prime That Wasn't by Andrei Zmievski for a discussion about how this regex fails for larger numbers because of backtracking.

Also if you wish to learn more about Perl one-liners, check out my Perl One-Liners Explained article series and download the perl1line.txt file.

Tweet

欢迎光临 Chinaunix (http://bbs.chinaunix.net/)