Artwork

Contenuto fornito da HPR Volunteer and Hacker Public Radio. Tutti i contenuti dei podcast, inclusi episodi, grafica e descrizioni dei podcast, vengono caricati e forniti direttamente da HPR Volunteer and Hacker Public Radio o dal partner della piattaforma podcast. Se ritieni che qualcuno stia utilizzando la tua opera protetta da copyright senza la tua autorizzazione, puoi seguire la procedura descritta qui https://it.player.fm/legal.
Player FM - App Podcast
Vai offline con l'app Player FM !

HPR4341: Transferring Large Data Sets

 
Condividi
 

Manage episode 472968897 series 108988
Contenuto fornito da HPR Volunteer and Hacker Public Radio. Tutti i contenuti dei podcast, inclusi episodi, grafica e descrizioni dei podcast, vengono caricati e forniti direttamente da HPR Volunteer and Hacker Public Radio o dal partner della piattaforma podcast. Se ritieni che qualcuno stia utilizzando la tua opera protetta da copyright senza la tua autorizzazione, puoi seguire la procedura descritta qui https://it.player.fm/legal.
Transferring Large Data Sets Very large data sets present their own problems. Not everyone has directories with hundreds of gigabytes of project files, but I do, and I assume I'm not the only one. For instance, I have a directory with over 700 radio shows, many of these directories also have a podcast, and they also have pictures and text files. Doing a properties check on the directory I see 450 gigabytes of data. When I started envisioning Libre Indie Archive I wanted to move the directories into archival storage using optical drives. My first attempt at this didn't work because I lost metadata when I wrote the optical drives since optical drives are read only. After further work and study I learned that tar files can preserve meta data if they are created and uncompressed as root. In fact, if you are running tar as root preserving file ownership and permissions is the default. So this means that optical drives are an option if you write tar archives onto the optical drives. I have better success rates with 25 GB Blue Ray Discs than with the 50 GB discs. So, if your directory breaks up into projects that fit on 25 GB discs, that's great. My data did not do this easily but tar does have an option to write a data set to multiple tar files each with a maximum size, labelling them -0 -1, etc. When using this multi volume feature you cannot use compression. So you will get tar files, not tar.gz files. It's better to break the file sets up in more reasonable sizes so I decided to divide the shows up alphabetically by title, so all the shows starting with the letter a would be one data set and then down the alphabet, one letter at a time. Most of the letters would result in a single tar file labeled -0 that would fit on the 25 GB disc. Many letters, however, took two or even three tar files that would have to be written on different disks and then concatenated on the primary system before they are extracted to the correct location in primaryfiles. There is a companion program to tar, called tarcat, that I used to combine 2 or 3 tar files split by length into a single tar file that could be extracted. I ran engrampa as root to extract the files. So, I used a tar command on the working system where my Something Blue radio shows are stored. Then I used K3b to burn these files onto a 25 GB Blu Ray Disc carefully labeling the discs and writing a text file that I used to keep up with which files I had already copied to Disc. Then on the Libre Indie Archive primary system I copied from the Blu Ray to the boot drive the file or files for that data set. Then I would use tarcat to combine the files if there was more than one file for that data set. And finally I would extract the files to primaryfiles by running engrampa as root. Now I'm going to go into details on each of these steps. First make sure that the Libre Indie Archive program, prep.sh, is in your home directory on your workstation. Then from the data directory to be archived, in my case the something_blue directory run prep.sh like this. ~/prep.sh This will create a file named IA_Origin.txt that lists the date, the computer and directory being archived, and the users and userids on that system. All very helpful information to have if at some time in the future you need to do a restore. Next create a tar data set for each letter of the alphabet. (You may want to divide y
  continue reading

4377 episodi

Artwork
iconCondividi
 
Manage episode 472968897 series 108988
Contenuto fornito da HPR Volunteer and Hacker Public Radio. Tutti i contenuti dei podcast, inclusi episodi, grafica e descrizioni dei podcast, vengono caricati e forniti direttamente da HPR Volunteer and Hacker Public Radio o dal partner della piattaforma podcast. Se ritieni che qualcuno stia utilizzando la tua opera protetta da copyright senza la tua autorizzazione, puoi seguire la procedura descritta qui https://it.player.fm/legal.
Transferring Large Data Sets Very large data sets present their own problems. Not everyone has directories with hundreds of gigabytes of project files, but I do, and I assume I'm not the only one. For instance, I have a directory with over 700 radio shows, many of these directories also have a podcast, and they also have pictures and text files. Doing a properties check on the directory I see 450 gigabytes of data. When I started envisioning Libre Indie Archive I wanted to move the directories into archival storage using optical drives. My first attempt at this didn't work because I lost metadata when I wrote the optical drives since optical drives are read only. After further work and study I learned that tar files can preserve meta data if they are created and uncompressed as root. In fact, if you are running tar as root preserving file ownership and permissions is the default. So this means that optical drives are an option if you write tar archives onto the optical drives. I have better success rates with 25 GB Blue Ray Discs than with the 50 GB discs. So, if your directory breaks up into projects that fit on 25 GB discs, that's great. My data did not do this easily but tar does have an option to write a data set to multiple tar files each with a maximum size, labelling them -0 -1, etc. When using this multi volume feature you cannot use compression. So you will get tar files, not tar.gz files. It's better to break the file sets up in more reasonable sizes so I decided to divide the shows up alphabetically by title, so all the shows starting with the letter a would be one data set and then down the alphabet, one letter at a time. Most of the letters would result in a single tar file labeled -0 that would fit on the 25 GB disc. Many letters, however, took two or even three tar files that would have to be written on different disks and then concatenated on the primary system before they are extracted to the correct location in primaryfiles. There is a companion program to tar, called tarcat, that I used to combine 2 or 3 tar files split by length into a single tar file that could be extracted. I ran engrampa as root to extract the files. So, I used a tar command on the working system where my Something Blue radio shows are stored. Then I used K3b to burn these files onto a 25 GB Blu Ray Disc carefully labeling the discs and writing a text file that I used to keep up with which files I had already copied to Disc. Then on the Libre Indie Archive primary system I copied from the Blu Ray to the boot drive the file or files for that data set. Then I would use tarcat to combine the files if there was more than one file for that data set. And finally I would extract the files to primaryfiles by running engrampa as root. Now I'm going to go into details on each of these steps. First make sure that the Libre Indie Archive program, prep.sh, is in your home directory on your workstation. Then from the data directory to be archived, in my case the something_blue directory run prep.sh like this. ~/prep.sh This will create a file named IA_Origin.txt that lists the date, the computer and directory being archived, and the users and userids on that system. All very helpful information to have if at some time in the future you need to do a restore. Next create a tar data set for each letter of the alphabet. (You may want to divide y
  continue reading

4377 episodi

Tất cả các tập

×
 
Loading …

Benvenuto su Player FM!

Player FM ricerca sul web podcast di alta qualità che tu possa goderti adesso. È la migliore app di podcast e funziona su Android, iPhone e web. Registrati per sincronizzare le iscrizioni su tutti i tuoi dispositivi.

 

Guida rapida

Ascolta questo spettacolo mentre esplori
Riproduci