Fossil SCM

disk I/O error on commit (AIX)

Closed

93166ec5cc5b429… · opened 16 years, 1 month ago

Type: Code_Defect
Priority:
Severity: Severe
Resolution: Not_A_Bug
Subsystem:
Created: Feb. 24, 2010 9:06 p.m.

I have been getting a 'disk I/O error' when performing commits on an AIX system. The error prevents the autosync-push from taking place. A manual push works just fine.

After a fair bit of tracking down, I found it is caused by the fsync(fd) call in unixDelete().

DBA0035:/home/dba0035/code/fossil-scm.org >fsl commit -m "where is>
New_Version: ba79ed2621c11aad28ed30d6d3de9f922a5c445c
dt: unixDelete(1) rc:0 zPath:/home/dba0035/code/fossil-scm.org/FOSSIL-mj544AAF2B fd:7
dt: unixDelete(2) rc:1290 errno:9
fossil: disk I/O error
COMMIT

If you have recently updated your fossil executable, you might need to run "fossil all rebuild" to bring the repository schemas up to date.
DBA0035:/home/dba0035/code/fossil-scm.org >

I tracked it down using the following debug patch:

--- src/sqlite3.c
+++ src/sqlite3.c
@@ -25516,20 +25516,18 @@
   unlink(zPath);
 #ifndef SQLITE_DISABLE_DIRSYNC
   if( dirSync ){
     int fd;
     rc = openDirectory(zPath, &fd);
+printf("dt: unixDelete(1) rc:%d zPath:%s fd:%d\n", rc, zPath, fd);
     if( rc==SQLITE_OK ){
 #if OS_VXWORKS
       if( fsync(fd)==-1 )
 #else
       if( fsync(fd) )
 #endif
       {
         rc = SQLITE_IOERR_DIR_FSYNC;
+printf("dt: unixDelete(2) rc:%d errno:%d\n", rc, errno);
       }
       if( close(fd)&&!rc ){
         rc = SQLITE_IOERR_DIR_CLOSE;
       }
     }

An errno of 9 is EBADF:

DBA0035:/usr/include >grep EBADF *
errno.h:#define EBADF   9       /* Bad file descriptor                  */
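
The failure should be reproducible outside of SQLite. The sketch below mirrors what unixDelete() does after the unlink(): it opens a directory read-only and calls fsync() on the resulting descriptor. The helper name try_dir_fsync is hypothetical and not part of SQLite. On many systems the call simply returns 0; on the AIX system above it would presumably return 9 (EBADF).

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper (not part of SQLite): open the directory zDir
** read-only, fsync() the descriptor, and return 0 on success or the
** errno value of whichever call failed. */
int try_dir_fsync(const char *zDir){
  int err = 0;
  int fd = open(zDir, O_RDONLY);
  if( fd<0 ) return errno;          /* could not open the directory at all */
  if( fsync(fd)!=0 ) err = errno;   /* the failing call reported in this ticket */
  close(fd);
  return err;
}
```

Calling try_dir_fsync(".") from the checkout directory and printing the result shows whether the platform accepts fsync() on a directory descriptor.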

To get around this, I have applied the following patch to bypass the dirSync processing. I am not sure of the implications, but nothing has broken yet!

--- sqlite3.c
+++ sqlite3.c
@@ -48918,13 +48918,12 @@
     }

     /* Delete the master journal file. This commits the transaction. After
     ** doing this the directory is synced again before any individual
     ** transaction files are deleted.
-    rc = sqlite3OsDelete(pVfs, zMaster, 1);
-    */
-    rc = sqlite3OsDelete(pVfs, zMaster, 0);
+    */
+    rc = sqlite3OsDelete(pVfs, zMaster, 1);
     sqlite3DbFree(db, zMaster);
     zMaster = 0;
     if( rc ){
       return rc;
     }

drh added on 2010-02-24 21:51:43:
Seems like an easier fix is to simply recompile with the -DSQLITE_DISABLE_DIRSYNC compile-time option.
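
As the patch context quoted above shows, the directory sync in unixDelete() sits inside #ifndef SQLITE_DISABLE_DIRSYNC, so defining that macro at compile time removes the fsync() call entirely. A minimal stand-alone sketch of the same pattern follows. The function delete_file, its explicit zDir parameter, and its return codes are hypothetical stand-ins for illustration and are not SQLite's actual implementation.

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical stand-in for unixDelete(): unlink zPath, then optionally
** fsync the containing directory zDir so the removal is durable.
** Returns 0 on success, -1 if unlink() fails, -2 if the directory
** fsync() fails. */
int delete_file(const char *zPath, const char *zDir, int dirSync){
  if( unlink(zPath)!=0 ) return -1;
#ifndef SQLITE_DISABLE_DIRSYNC
  if( dirSync ){
    int fd = open(zDir, O_RDONLY);
    if( fd>=0 ){
      int failed = (fsync(fd)!=0);
      close(fd);
      if( failed ) return -2;
    }
  }
#else
  /* Directory sync compiled out: the arguments are intentionally unused. */
  (void)zDir;
  (void)dirSync;
#endif
  return 0;
}
```

Built with -DSQLITE_DISABLE_DIRSYNC on the compiler command line, the function reduces to the unlink() alone, which is the behavior the suggestion relies on.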


anonymous added on 2010-02-25 08:40:04:
Sometimes you can't see the answer for looking... :)

Recompiled as suggested and autosync now working beautifully!!! Suddenly fossil comes alive.

Autosync:  http://nnn.nnn.nnn.nnn:8080/
                Bytes      Cards  Artifacts     Deltas
Send:             130          1          0          0
Received:        1196         26          0          0
Total network traffic: 315 bytes sent, 831 bytes received
New_Version: 4d9585118174ba091078318d13942715b662c86b
Autosync:  http://nnn.nnn.nnn.nnn:8080/
                Bytes      Cards  Artifacts     Deltas
Send:            2763         31          1          2
Received:        1334         29          0          0
Total network traffic: 1713 bytes sent, 903 bytes received
