Pausing a running operation

The problem is that a Fortran master program of mine calls other slave programs during the course of its run via the nohup command. Sometimes these slave programs go haywire and need to be terminated. Killing them outright crashes the master program, because it parses the output of the slave programs based on the expected format of a 100% working run (number of rows, etc.).

At the moment I just run a simple script that copies a perfectly formatted dummy output file into place and kills the PID of the stalled process. It works about 50% of the time. Sometimes the master program detects termination of the slave program and tries to parse the output before the script has replaced the malformed output file with the working dummy one (the slave program writes to the output file in real time). Or the slave program overwrites the dummy output file before it is killed. At that point the whole thing crashes and I need to restart a run that has been going on for a few days.

Freezing the stalled slave program, deleting its output file, replacing it with the dummy and then killing the slave would work. Changing the master program to be more robust to formatting errors would also work, but so much time has been spent on the same program already that I am wary of changing the basis of comparison by changing the program. As it is, I'm being a pedant about keeping everything the same, including RNG seeds.
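The freeze, replace, kill sequence described above could be sketched roughly as follows. This is only an illustration, not the script currently in use: the function name `freeze_replace_kill` and its parameters are made up for the example. It assumes a POSIX system, that the slave is a single process whose PID is known, and that the master simply waits for the slave to exit rather than reacting to it being stopped.

```python
import os
import shutil
import signal


def freeze_replace_kill(pid, output_path, dummy_path):
    """Freeze a stalled process, swap in a dummy output file, then kill it.

    All three arguments are hypothetical names chosen for this sketch.
    """
    # SIGSTOP cannot be caught or ignored, so once it is delivered the
    # slave no longer runs and can no longer write to its output file.
    os.kill(pid, signal.SIGSTOP)

    # Copy the dummy next to the target, then rename it over the target.
    # os.replace is atomic on POSIX when both paths are on the same
    # filesystem, so a reader never sees a half-copied file.
    tmp_path = output_path + ".tmp"
    shutil.copyfile(dummy_path, tmp_path)
    os.replace(tmp_path, output_path)

    # SIGKILL is delivered even to a stopped process, so the slave dies
    # without ever resuming and touching the replaced file.
    os.kill(pid, signal.SIGKILL)
```

Because the slave is already frozen before the file is touched, and the file is already in place before the slave dies, neither of the two races described above (master parsing too early, slave overwriting the dummy) should be able to occur under these assumptions. If the slave spawns children of its own, they would need the same treatment, which this sketch does not cover.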