<?xml version="1.0"?>
<feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en-GB">
	<id>https://t2bwiki.iihe.ac.be/index.php?action=history&amp;feed=atom&amp;title=GetLostFiles</id>
	<title>GetLostFiles - Revision history</title>
	<link rel="self" type="application/atom+xml" href="https://t2bwiki.iihe.ac.be/index.php?action=history&amp;feed=atom&amp;title=GetLostFiles"/>
	<link rel="alternate" type="text/html" href="https://t2bwiki.iihe.ac.be/index.php?title=GetLostFiles&amp;action=history"/>
	<updated>2026-05-16T12:38:35Z</updated>
	<subtitle>Revision history for this page on the wiki</subtitle>
	<generator>MediaWiki 1.45.3</generator>
	<entry>
		<id>https://t2bwiki.iihe.ac.be/index.php?title=GetLostFiles&amp;diff=103&amp;oldid=prev</id>
		<title>Maintenance script: Created page with &quot; == Retrieve lost files from datasets == PageOutline ---- === Introduction === Some files in datasets are not copied correctly to the T2&#039;s. It is at most a few files p...&quot;</title>
		<link rel="alternate" type="text/html" href="https://t2bwiki.iihe.ac.be/index.php?title=GetLostFiles&amp;diff=103&amp;oldid=prev"/>
		<updated>2015-08-26T12:28:31Z</updated>

		<summary type="html">&lt;p&gt;Created page with &amp;quot; == Retrieve lost files from datasets == &lt;a href=&quot;/index.php?title=PageOutline&amp;amp;action=edit&amp;amp;redlink=1&quot; class=&quot;new&quot; title=&quot;PageOutline (page does not exist)&quot;&gt;PageOutline&lt;/a&gt; ---- === Introduction === Some files in datasets are not copied correctly to the T2&amp;#039;s. It is at most a few files p...&amp;quot;&lt;/p&gt;
&lt;p&gt;&lt;b&gt;New page&lt;/b&gt;&lt;/p&gt;&lt;div&gt;&lt;br /&gt;
== Retrieve lost files from datasets ==&lt;br /&gt;
----&lt;br /&gt;
=== Introduction ===&lt;br /&gt;
Some files in datasets are not copied correctly to the T2 sites; this affects at most a few files per dataset. A few scripts are in place to put them back where they belong.&lt;br /&gt;
&lt;br /&gt;
=== Identify ===&lt;br /&gt;
First, go to [https://cms-popularity.cern.ch/popdb/popularity/corruptedFiles the corrupted files page] and check whether any files from our site are listed. If so, click on the &amp;quot;Get source JSON&amp;quot; tab.&lt;br /&gt;
&lt;br /&gt;
=== Create the necessary files ===&lt;br /&gt;
&lt;br /&gt;
Put the downloaded file in /user/odevroed/Get_Lost_Files&amp;lt;br&amp;gt;&lt;br /&gt;
Edit the file and remove all the HTML tags, so that only the JSON part remains. In the rest of this document, the file is called cms-popularity.cern.ch.json&amp;lt;br&amp;gt;&lt;br /&gt;
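The tag removal can also be scripted. Below is a minimal Python sketch, assuming the downloaded page is HTML wrapped around a single JSON document; the names extract_json and _TextOnly are illustrative and are not part of download_missing_dataset_files.py.&lt;br /&gt;

```python
import json
from html.parser import HTMLParser


class _TextOnly(HTMLParser):
    """Collect only the text between tags, dropping the tags themselves."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        # Called for every run of text that is not part of a tag.
        self.chunks.append(data)


def extract_json(html_text):
    """Strip HTML tags and parse what remains as JSON.

    Assumption: once the tags are removed, the remaining text is
    exactly one valid JSON document.
    """
    parser = _TextOnly()
    parser.feed(html_text)
    parser.close()  # flush any text still buffered by the parser
    return json.loads("".join(parser.chunks))
```

The parsed result could then be written with json.dump to ./cms-popularity.cern.ch.json before the first run of the script.&lt;br /&gt;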
Run the script a first time (with option 1 :) ):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
python /user/odevroed/bin/download_missing_dataset_files.py 1 ./cms-popularity.cern.ch.json&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
 &lt;br /&gt;
This generates the file &amp;quot;files_to_retrieve&amp;quot;, which lists only the files from our site that are actually not present (a file may have been reported just by coincidence).&amp;lt;br&amp;gt;&lt;br /&gt;
Open this file and use [https://cmsweb.cern.ch/das/ DAS] to find out at which site each file resides. Usually many files from the list are present at one single site and can be grouped together.&amp;lt;br&amp;gt;&lt;br /&gt;
Put the files found at one site together in a new file, say &amp;quot;files_per_site&amp;quot;, and pass it as an argument to the second pass of the script (replace T2_CH_CSCS with the appropriate site name):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
python /user/odevroed/bin/download_missing_dataset_files.py 2 T2_CH_CSCS ./files_per_site&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
The script prints output like the following:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
fetching from site:&lt;br /&gt;
srm://storage01.lcg.cscs.ch:8443/srm/managerv2?SFN=/pnfs/lcg.cscs.ch/cms/trivcat&lt;br /&gt;
&lt;br /&gt;
The files are ready to be transferred&lt;br /&gt;
First, copy the files to us:&lt;br /&gt;
     srmcp -debug -streams_num=1 -2 -copyjobfile=transfer_to_us&lt;br /&gt;
Then, put them on storage:&lt;br /&gt;
     srmcp -debug -streams_num=1 -2 -copyjobfile=put_on_storage&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Then do what the script told you to do: run the two srmcp commands in the order shown :)&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
{{TracNotice|{{PAGENAME}}}}&lt;/div&gt;</summary>
		<author><name>Maintenance script</name></author>
	</entry>
</feed>