Fossil SCM

Updates to the Robot Defense Settings page to make it easier to configure the latest defensive options.

drh 2025-10-09 12:51 tarball-robot-defense
Commit 7f846263708c7766f9bff1c7f72c53e43491bf2c39423b23e31155fd91a18ad1
1 file changed +24 -16
--- src/setup.c
+++ src/setup.c
@@ -470,12 +470,12 @@
  @ <p>A Fossil website can have billions of pages in its tree, even for a
  @ modest project. Many of those pages (examples: diffs and tarballs)
  @ might be expensive to compute. A robot that tries to walk the entire
  @ website can present a crippling CPU and bandwidth load.
  @
- @ <p>The settings on this page are intended to help site administrators
- @ defend the site against robots.
+ @ <p>The settings on this page are intended to help administrators
+ @ defend against abusive robots.
  @
  @ <form action="%R/setup_robot" method="post"><div>
  login_insert_csrf_secret();
  @ <input type="submit" name="submit" value="Apply Changes"></p>
  @ <hr>
@@ -482,43 +482,51 @@
  @ <p><b>Do not allow robots access to these pages.</b><br>
  @ If the page name matches the GLOB pattern of this setting, and the
  @ users is "nobody", and the client has not previously passed a captcha
  @ test to show that it is not a robot, then the page is not displayed.
  @ A captcha test is is rendered instead.
- @ The recommended value for this setting is:
+ @ The default value for this setting is:
  @ <p>
  @ &emsp;&emsp;&emsp;<tt>%h(robot_restrict_default())</tt>
  @ <p>
  @ The "diff" tag covers all diffing pages such as /vdiff, /fdiff, and
  @ /vpatch. The "annotate" tag covers /annotate and also /blame and
  @ /praise. The "zip" covers itself and also /tarball and /sqlar. If a
  @ tag has an "X" character appended, then it only applies if query
- @ parameters are such that the page is particularly difficult to compute.
+ @ parameters are such that the page is expensive and/or unusual.
  @ In all other case, the tag should exactly match the page name.
  @
  @ To disable robot restrictions, change this setting to "off".
  @ (Property: robot-restrict)
  @ <br>
  textarea_attribute("", 2, 80,
  "robot-restrict", "rbrestrict", robot_restrict_default(), 0);
 
- @ <hr>
- @ <p><b>Exceptions to anti-robot restrictions</b><br>
- @ The entry below is a list of
- @ <a href="%R/re_rules">regular expressions</a>, one per line.
- @ If any of these regular expressions match the input URL, then the
- @ request is exempt from anti-robot defenses. Use this, for example,
- @ to allow scripts to download release tarballs using a pattern
- @ like:</p>
- @ <p>
- @ &emsp;&emsp;<tt>^/tarball/(version-[0-9.]+|release)/</tt>
- @ <p>The pattern should match against the REQUEST_URI with the
+ @ <p><b>Exception #1</b><br>
+ @ If "zipX" appears in the robot-restrict list above, then tarballs,
+ @ ZIP-archives, and SQL-archives may be downloaded by robots if
+ @ the check-in is a leaf (robot-zip-leaf):<br>
+ onoff_attribute("Allow tarballs for leaf check-ins",
+ "robot-zip-leaf", "rzleaf", 0, 0);
+
+ @ <p><b>Exception #2</b><br>
+ @ If "zipX" appears in the robot-restrict list above, then tarballs,
+ @ ZIP-archives, and SQL-archives may be downloaded by robots if
+ @ the check-in has one or more tags that match the following
+ @ list of GLOB patterns: (robot-zip-tag)<br>
+ textarea_attribute("", 2, 80,
+ "robot-zip-tag", "rztag", "", 0);
+
+ @ <p><b>Exception #3</b><br>
+ @ If the request URI matches any of the following
+ @ <a href="%R/re_rules">regular expressions</a> (one per line), then the
+ @ request is exempt from anti-robot defenses.
+ @ The regular expression is matched against the REQUEST_URI with the
  @ SCRIPT_NAME prefix removed, and with QUERY_STRING appended following
  @ a "?" if QUERY_STRING exists. (Property: robot-exception)<br>
  textarea_attribute("", 3, 80,
  "robot-exception", "rbexcept", "", 0);
-
  @ <hr>
  addAutoHyperlinkSettings();
 
  @ <hr>
  entry_attribute("Anonymous Login Validity", 11, "anon-cookie-lifespan",
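The help text for Exception #1 and Exception #2 describes a small decision rule for archive downloads once "zipX" is in the robot-restrict list. The following is a minimal sketch of that rule, not Fossil's actual implementation. The struct types, field names, and the function `zipx_exception_applies` are hypothetical, and POSIX `fnmatch()` stands in for Fossil's own GLOB matcher, whose syntax may differ in detail.

```c
#include <assert.h>
#include <fnmatch.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical view of the two settings; these are not Fossil's types. */
typedef struct {
  bool zip_leaf;                 /* robot-zip-leaf is switched on */
  const char *const *tag_globs;  /* patterns from robot-zip-tag */
  size_t n_tag_globs;
} ZipExceptionSettings;

/* Hypothetical facts about the check-in whose archive is requested. */
typedef struct {
  bool is_leaf;                  /* the check-in is a leaf */
  const char *const *tags;       /* tags attached to the check-in */
  size_t n_tags;
} CheckinFacts;

/*
** Return true if Exception #1 or Exception #2 lets a robot download
** the archive. The caller is assumed to have already established that
** "zipX" appears in the robot-restrict list.
*/
bool zipx_exception_applies(const ZipExceptionSettings *s,
                            const CheckinFacts *c){
  assert( s!=NULL && c!=NULL );

  /* Exception #1: leaf check-ins, when robot-zip-leaf is on. */
  if( s->zip_leaf && c->is_leaf ) return true;

  /* Exception #2: any tag of the check-in matches any GLOB pattern. */
  for(size_t i=0; i<c->n_tags; i++){
    for(size_t j=0; j<s->n_tag_globs; j++){
      /* fnmatch() returns 0 on a match; it only approximates GLOB. */
      if( fnmatch(s->tag_globs[j], c->tags[i], 0)==0 ) return true;
    }
  }
  return false;
}
```

With made-up patterns such as `release` and `version-*`, a non-leaf check-in tagged `version-2.27` would qualify under Exception #2 in this sketch, while an untagged non-leaf check-in would still get the captcha.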
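The robot-exception help text says each regular expression is matched against the REQUEST_URI with the SCRIPT_NAME prefix removed and with "?" plus QUERY_STRING appended when a query string exists. The sketch below illustrates that string construction and the "any pattern matches" rule. It is not Fossil's code: the function names are hypothetical, POSIX extended regular expressions from `<regex.h>` stand in for Fossil's own regex dialect (documented at /re_rules), and `request_uri` is assumed to carry no query string of its own.

```c
#include <assert.h>
#include <regex.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/*
** Build the string that exception patterns are tested against.
** Returns false if the result does not fit in out[].
*/
bool build_match_target(const char *request_uri, const char *script_name,
                        const char *query_string,
                        char *out, size_t out_size){
  assert( request_uri!=NULL && out!=NULL && out_size>0 );
  const char *path = request_uri;
  size_t n = script_name ? strlen(script_name) : 0;

  /* Drop the SCRIPT_NAME prefix when the URI really starts with it. */
  if( n>0 && strncmp(request_uri, script_name, n)==0 ){
    path = request_uri + n;
  }

  int written;
  if( query_string!=NULL && query_string[0]!='\0' ){
    written = snprintf(out, out_size, "%s?%s", path, query_string);
  }else{
    written = snprintf(out, out_size, "%s", path);
  }
  return written>=0 && (size_t)written<out_size;
}

/* Return true if any of the patterns (one per configured line) matches. */
bool matches_any_exception(const char *const *patterns, size_t n_patterns,
                           const char *target){
  for(size_t i=0; i<n_patterns; i++){
    regex_t re;
    /* Patterns that fail to compile are skipped in this sketch. */
    if( regcomp(&re, patterns[i], REG_EXTENDED|REG_NOSUB)!=0 ) continue;
    int rc = regexec(&re, target, 0, NULL, 0);
    regfree(&re);
    if( rc==0 ) return true;
  }
  return false;
}
```

Using the example pattern from the previous version of the help text, `^/tarball/(version-[0-9.]+|release)/`, a request for `/fossil/tarball/release/src.tar.gz` under a SCRIPT_NAME of `/fossil` (both paths made up for illustration) yields the target `/tarball/release/src.tar.gz`, which matches.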
