pages tagged howtorohieb.namehttps://rohieb.name/blag/tag/howto/rohieb.nameikiwiki2023-06-09T08:12:10ZOptimizing XSane's scanned PDFs (also: PDF internals)https://rohieb.name/blag/post/optimizing-xsane-s-scanned-pdfs/rohieb
CC-BY-SA 3.0
2023-06-09T08:12:10Z2013-11-17T22:58:35Z
<h2 id="problem">Problem</h2>
<p>I use <a href="http://www.xsane.org/" title="XSane homepage">XSane</a> to scan documents for my digital archive. I want them to be
in PDF format and have a reasonable resolution (better than 200 dpi, so I
can try OCRing them afterwards). However, the PDFs created by XSane’s multipage
mode are too large, about 250 MB for a 20-page document scanned at
200 dpi.</p>
<table class="img"><caption>XSane’s Multipage mode</caption><tr><td><a href="https://rohieb.name/blag/post/optimizing-xsane-s-scanned-pdfs/xsane-multipage-mode.png"><img src="https://rohieb.name/blag/post/optimizing-xsane-s-scanned-pdfs/x200-xsane-multipage-mode.png" width="223" height="200" class="img" /></a></td></tr></table>
<h2 id="firstnon-optimalsolution">First (non-optimal) solution</h2>
<p>At first, I tried to optimize the PDF using <a href="http://ghostscript.com" title="Ghostscript homepage">GhostScript</a>. I
<a href="https://rohieb.name/blag/post/use-ghostscript-to-convert-pdf-files/">already wrote</a> about how GhostScript’s
<code>-dPDFSETTINGS</code> option can be used to minimize PDFs by rendering the pictures to
a smaller resolution. In fact, there are <a href="http://milan.kupcevic.net/ghostscript-ps-pdf/#refs" title="Ghostscript PDF Reference & Tips">multiple rendering modes</a>
(<code>screen</code> for 96 dpi, <code>ebook</code> for 150 dpi, <code>printer</code> for 300 dpi,
and <code>prepress</code> for color-preserving 300 dpi), but they are pre-defined, and
for my 200 dpi images, <code>ebook</code> was not enough (I would lose resolution),
while <code>printer</code> was too high and would only enlarge the PDF.</p>
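<p>For in-between targets, Ghostscript also exposes the individual distiller parameters behind those presets, so a custom image resolution such as 200 dpi can be requested directly. This is a sketch based on the pdfwrite parameter names documented for <code>ps2pdf</code> (the file names are placeholders); check your Ghostscript version's documentation before relying on it:</p>

```shell
gs -sDEVICE=pdfwrite -dNOPAUSE -dQUIET -dBATCH \
   -dDownsampleColorImages=true -dColorImageResolution=200 \
   -dDownsampleGrayImages=true  -dGrayImageResolution=200 \
   -sOutputFile=smaller.pdf original.pdf
```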
<h2 id="interlude:pdfinternals">Interlude: PDF Internals</h2>
<p>The best thing to do was to find out how the images were embedded in the PDF.
Since most PDF files are also partly human-readable, I opened my file with vim.
(Also, I was surprised that <a href="https://rohieb.name/blag/tag/howto/vim-syntax-highlighting.png">vim has syntax highlighting for
PDF</a>.) Before we continue, I'll give a short
introduction to the PDF file format (for the long version, see <a href="http://partners.adobe.com/public/developer/en/pdf/PDFReference.pdf" title="Adobe Portable Document Format, Version 1.4">Adobe’s PDF
reference</a>).</p>
<h3 id="buildingblocks">Building Blocks</h3>
<p>Every PDF file starts with the <a href="https://en.wikipedia.org/wiki/Magic_number_(programming)#Magic_numbers_in_files" title="Wikipedia: Magic numbers in files">magic string</a> that identifies the version
of the standard which the document conforms to, like <code>%PDF-1.4</code>. After that, a
PDF document is made up of the following objects:</p>
<dl>
<dt>Boolean values</dt>
<dd>
<code>true</code> and <code>false</code>
</dd>
<dt>Integers and floating-point numbers</dt>
<dd>
for example, <code>1337</code>, <code>-23.42</code> and <code>.1415</code>
</dd>
<dt>Strings</dt>
<dd>
<ul>
<li>interpreted as literal characters when enclosed in parentheses: <code>(This
is a string.)</code> These can contain escaped characters, particularly
escaped closing parentheses and control characters: <code>(This string contains a
literal \) and some\n newlines.\n)</code>.</li>
<li>interpreted as hexadecimal data when enclosed in angled brackets:
<code><53 61 6D 70 6C 65></code> equals <code>(Sample)</code>.</li>
</ul>
</dd>
<dt>Names</dt>
<dd>
starting with a forward slash, like <code>/Type</code>. You can think of them like
identifiers in programming languages.
</dd>
<dt>Arrays</dt>
<dd>
enclosed in square brackets:
<code>[ -1 4 6 (A String) /AName [ (strings in arrays in arrays!) ] ]</code>
</dd>
<dt>Dictionaries</dt>
<dd>
key-value stores, which are enclosed in double angled brackets. The key must
be a name, the value can be any object. Keys and values alternate,
beginning with the first key:
<code><< /FirstKey (First Value) /SecondKey 3.14 /ThirdKey /ANameAsValue >></code>
Usually, the first key is <code>/Type</code> and defines what the dictionary actually
describes.
</dd>
<dt>Stream Objects</dt>
<dd>
a collection of bytes. In contrast to strings, stream objects are usually
used for large amounts of data which may not be read entirely, while strings
are always read as a whole. For example, streams can be used to embed images
or metadata.
</dd>
<dd>
Streams consist of a dictionary, followed by the keyword <code>stream</code>, the raw
content of the stream, and the keyword <code>endstream</code>. The dictionary describes
the stream’s length and the filters that have been applied to it, which
basically define the encoding the data is stored in. For example, data
streams can be compressed with various algorithms.
</dd>
<dt>The Null Object</dt>
<dd>
Represented by the literal string <code>null</code>.
</dd>
<dt>Indirect Objects</dt>
<dd>
Every object in a PDF document can also be stored as an indirect object,
which means that it is given a label and can be used multiple times in the
document. The label consists of two numbers, a positive <em>object number</em>
(which makes the object unique) and a non-negative <em>generation number</em>
(which allows objects to be updated incrementally by appending to the file).
</dd>
<dd>
Indirect objects are defined by their object number, followed by their
generation number, the keyword <code>obj</code>, the contents of the object, and the
keyword <code>endobj</code>. Example: <code>1 0 obj (I'm an object!) endobj</code> defines the
indirect object with object number 1 and generation number 0, which consists
only of the string “I'm an object!”. Likewise, more complex data structures
can be labeled with indirect objects.
</dd>
<dd>
Referencing an indirect object works by giving the object and generation
number, followed by an uppercase R: <code>1 0 R</code> references the object created
above. References can be used anywhere a (direct) object could be
used instead.
</dd>
</dl>
<p>Using these objects, a PDF document builds up a tree structure, starting from the
root object, which has the object number 1 and is a dictionary with the value
<code>/Catalog</code> assigned to the key <code>/Type</code>. The other values of this dictionary
point to the objects describing the outlines and pages of the document, which in
turn reference other objects describing single pages, which point to objects
describing drawing operations or text blocks, etc.</p>
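<p>Because these building blocks are plain text, we can even fake a tiny fragment in the shell and search it like any other text file. The toy skeleton below is <em>not</em> a complete, viewable PDF (it lacks the cross-reference table and trailer); it only illustrates the object syntax described above:</p>

```shell
# Write the magic string plus one indirect object (the catalog).
# %% in printf's format string yields a literal %.
printf '%%PDF-1.4\n1 0 obj\n<< /Type /Catalog /Pages 3 0 R >>\nendobj\n' > toy.pdf

# The dictionary is human-readable, so ordinary text tools find it:
grep '/Type /Catalog' toy.pdf
```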
<h3 id="dissectingthepdfscreatedbyxsane">Dissecting the PDFs created by XSane</h3>
<p>Now that we know what a PDF document looks like, we can go back to our initial
problem and try to find out why my PDF file was so huge. I will walk you through
the PDF object by object.</p>
<div class="highlight-pdf"><pre class="hl"><span class="hl ppc">%PDF-1.4</span>
<span class="hl kwa">1 0 obj</span>
<span class="hl kwb"><<</span> <span class="hl kwc">/Type /Catalog</span>
<span class="hl kwc">/Outlines</span> <span class="hl kwa">2 0 R</span>
<span class="hl kwc">/Pages</span> <span class="hl kwa">3 0 R</span>
<span class="hl kwb">>></span>
<span class="hl kwa">endobj</span>
</pre></div>
<p>This is just the magic string declaring the document as PDF-1.4, and the root
object with object number 1, which references objects number 2 for Outlines and
number 3 for Pages. We're not interested in outlines, so let's look at the pages.</p>
<div class="highlight-pdf"><pre class="hl"><span class="hl kwa">3 0 obj</span>
<span class="hl kwb"><<</span> <span class="hl kwc">/Type /Pages</span>
<span class="hl kwc">/Kids</span> <span class="hl kwb">[</span>
<span class="hl kwa">6 0 R</span>
<span class="hl kwa">8 0 R</span>
<span class="hl kwa">10 0 R</span>
<span class="hl kwa">12 0 R</span>
<span class="hl kwb">]</span>
<span class="hl kwc">/Count</span> <span class="hl num">4</span>
<span class="hl kwb">>></span>
<span class="hl kwa">endobj</span>
</pre></div>
<p>OK, apparently this document has four pages, which are referenced by objects
number 6, 8, 10 and 12. This makes sense, since I scanned four pages ;-)</p>
<p>Let's start with object number 6:</p>
<div class="highlight-pdf"><pre class="hl"><span class="hl kwa">6 0 obj</span>
<span class="hl kwb"><<</span> <span class="hl kwc">/Type /Page</span>
<span class="hl kwc">/Parent</span> <span class="hl kwa">3 0 R</span>
<span class="hl kwc">/MediaBox</span> <span class="hl kwb">[</span><span class="hl num">0 0 596 842</span><span class="hl kwb">]</span>
<span class="hl kwc">/Contents</span> <span class="hl kwa">7 0 R</span>
<span class="hl kwc">/Resources</span> <span class="hl kwb"><<</span> <span class="hl kwc">/ProcSet</span> <span class="hl kwa">8 0 R</span> <span class="hl kwb">>></span>
<span class="hl kwb">>></span>
<span class="hl kwa">endobj</span>
</pre></div>
<p>We see that object number 6 is a page object, and the actual content is in
object number 7. More redirection, yay!</p>
<div class="highlight-pdf"><pre class="hl"><span class="hl kwa">7 0 obj</span>
<span class="hl kwb"><<</span> <span class="hl kwc">/Length</span> <span class="hl num">2678332</span> <span class="hl kwb">>></span>
<span class="hl str">stream</span>
<span class="hl str">q</span>
<span class="hl str">1 0 0 1 0 0 cm</span>
<span class="hl str">1.000000 0.000000 -0.000000 1.000000 0 0 cm</span>
<span class="hl str">595.080017 0 0 841.679993 0 0 cm</span>
<span class="hl str">BI</span>
<span class="hl str"> /W 1653</span>
<span class="hl str"> /H 2338</span>
<span class="hl str"> /CS /G</span>
<span class="hl str"> /BPC 8</span>
<span class="hl str"> /F /FlateDecode</span>
<span class="hl str">ID</span>
<span class="hl str">x$¼[$;¾åù!fú¥¡aæátq.4§ [ ...byte stream shortened... ]</span>
<span class="hl str">EI</span>
<span class="hl str">Q</span>
<span class="hl str">endstream</span>
<span class="hl kwa">endobj</span>
</pre></div>
<p>Aha, here is where the magic happens. Object number 7 is a stream object of
2,678,332 bytes (about 2 MB) and contains drawing operations! After skipping
around a bit in Adobe’s PDF reference (chapters 3 and 4), here is the annotated
version of the stream content:</p>
<div class="highlight-pdf"><pre class="hl">q <span class="hl slc">% Save drawing context</span>
<span class="hl num">1 0 0 1 0 0</span> cm <span class="hl slc">% Set up coordinate space for image</span>
<span class="hl num">1.000000 0.000000 -0.000000 1.000000 0 0</span> cm
<span class="hl num">595.080017 0 0 841.679993 0 0</span> cm
BI <span class="hl slc">% Begin Image</span>
<span class="hl kwc">/W</span> <span class="hl num">1653</span> <span class="hl slc">% Image width is 1653 pixel</span>
<span class="hl kwc">/H</span> <span class="hl num">2338</span> <span class="hl slc">% Image height is 2338 pixel</span>
<span class="hl kwc">/CS /G</span> <span class="hl slc">% Color space is Gray</span>
<span class="hl kwc">/BPC</span> <span class="hl num">8</span> <span class="hl slc">% 8 bits per pixel</span>
<span class="hl kwc">/F /FlateDecode</span> <span class="hl slc">% Filters: data is Deflate-compressed</span>
ID <span class="hl slc">% Image Data follows:</span>
x$¼<span class="hl kwb">[</span>$;¾åù!fú¥¡aæátq<span class="hl num">.4</span>§ <span class="hl kwb">[</span> ...byte stream shortened... <span class="hl kwb">]</span>
EI <span class="hl slc">% End Image</span>
Q <span class="hl slc">% Restore drawing context</span>
</pre></div>
<p>So now we know why the PDF was so huge: the line <code>/F /FlateDecode</code> tells us that
the image data is stored losslessly with <a href="https://en.wikipedia.org/wiki/DEFLATE" title="Wikipedia: DEFLATE algorithm">Deflate</a> compression (this is
basically what PNG uses). However, scanned images, as well as photographed
pictures, have the tendency to become very big when stored losslessly, due to the
fact that image sensors always add noise from the universe, and lossless
compression also has to account for this noise. In contrast, lossy
compression like JPEG, which uses <a href="http://en.wikipedia.org/wiki/Discrete_cosine_transform" title="Wikipedia: Discrete cosine transform">discrete cosine transform</a>, only has to
approximate the image (and therefore the noise from the sensor) to a certain
degree, therefore reducing the space needed to save the image. And the PDF
standard also allows image data to be DCT-compressed, by adding <code>/DCTDecode</code> to
the filters.</p>
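<p>In hindsight, Ghostscript can also be asked to re-encode embedded images with DCT instead of Flate when rewriting a PDF. The following is a sketch based on my reading of the pdfwrite distiller parameters (<code>AutoFilterGrayImages</code>, <code>GrayImageFilter</code> and their color counterparts); the file names are placeholders, and you should verify the parameters against your Ghostscript version's documentation:</p>

```shell
gs -sDEVICE=pdfwrite -dNOPAUSE -dQUIET -dBATCH \
   -dAutoFilterGrayImages=false -dGrayImageFilter=/DCTEncode \
   -dAutoFilterColorImages=false -dColorImageFilter=/DCTEncode \
   -sOutputFile=dct-compressed.pdf scanned.pdf
```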
<h2 id="secondsolution:useabettercompressionalgorithm">Second solution: use a (better) compression algorithm</h2>
<p>Now that I knew where the problem was, I could try to create PDFs with DCT
compression. I still had the original, uncompressed <a href="https://en.wikipedia.org/wiki/Netpbm_format" title="Wikipedia: Netpbm format">PNM</a> files that fell out
of XSane’s multipage mode (just look in the multipage project folder), so I
started to play around a bit with <a href="http://www.imagemagick.org" title="ImageMagick homepage">ImageMagick’s</a> <code>convert</code> tool, which can
also convert images to PDF.</p>
<h3 id="convertingpnmtopdf">Converting PNM to PDF</h3>
<p>First, I tried converting the uncompressed PNMs to PDF:</p>
<pre><code>$ convert image*.pnm document.pdf
</code></pre>
<p><code>convert</code> generally takes parameters of the form <code>inputfile outputfile</code>, but it
also allows us to specify more than one input file (which is, oddly,
undocumented in the <a href="http://manpages.debian.net/cgi-bin/man.cgi?query=convert" title="man convert(1)">man page</a>). In that case it tries to create
multi-page documents, if possible. With PDF as output format, this results in
one input file per page.</p>
<p>The embedded image objects looked somewhat like the following:</p>
<div class="highlight-pdf"><pre class="hl"><span class="hl kwa">8 0 obj</span>
<span class="hl kwb"><<</span>
<span class="hl kwc">/Type /XObject</span>
<span class="hl kwc">/Subtype /Image</span>
<span class="hl kwc">/Name /Im0</span>
<span class="hl kwc">/Filter</span> <span class="hl kwb">[</span> <span class="hl kwc">/RunLengthDecode</span> <span class="hl kwb">]</span>
<span class="hl kwc">/Width</span> <span class="hl num">1653</span>
<span class="hl kwc">/Height</span> <span class="hl num">2338</span>
<span class="hl kwc">/ColorSpace</span> <span class="hl kwa">10 0 R</span>
<span class="hl kwc">/BitsPerComponent</span> <span class="hl num">8</span>
<span class="hl kwc">/Length</span> <span class="hl kwa">9 0 R</span>
<span class="hl kwb">>></span>
<span class="hl str">stream</span>
<span class="hl str">% [ raw byte data ]</span>
<span class="hl str">endstream</span>
</pre></div>
<p>The filter <code>/RunLengthDecode</code> indicates that the stream data is compressed with
<a href="https://en.wikipedia.org/wiki/Run-length_encoding" title="Wikipedia: Run-length encoding">Run-length encoding</a>, another simple lossless compression. Not what I
wanted. (Apart from that, <code>convert</code> embeds images as XObjects, but there is not
much difference from the inline images described above.)</p>
<h3 id="convertingpnmtojpgthentopdf">Converting PNM to JPG, then to PDF</h3>
<p>Next, I converted the PNMs to JPG, then to PDF.</p>
<pre><code>$ convert image*.pnm image.jpg
$ convert image*jpg document.pdf
</code></pre>
<p>(The first command creates the output files <code>image-0.jpg</code>, <code>image-1.jpg</code>, etc.,
since JPG does not support multiple pages in one file.)</p>
<p>When looking at the PDF, we see that we now have DCT-compressed images inside
the PDF:</p>
<div class="highlight-pdf"><pre class="hl"><span class="hl kwa">8 0 obj</span>
<span class="hl kwb"><<</span>
<span class="hl kwc">/Type /XObject</span>
<span class="hl kwc">/Subtype /Image</span>
<span class="hl kwc">/Name /Im0</span>
<span class="hl kwc">/Filter</span> <span class="hl kwb">[</span> <span class="hl kwc">/DCTDecode</span> <span class="hl kwb">]</span>
<span class="hl kwc">/Width</span> <span class="hl num">1653</span>
<span class="hl kwc">/Height</span> <span class="hl num">2338</span>
<span class="hl kwc">/ColorSpace</span> <span class="hl kwa">10 0 R</span>
<span class="hl kwc">/BitsPerComponent</span> <span class="hl num">8</span>
<span class="hl kwc">/Length</span> <span class="hl kwa">9 0 R</span>
<span class="hl kwb">>></span>
<span class="hl str">stream</span>
<span class="hl str">% [ raw byte data ]</span>
<span class="hl str">endstream</span>
</pre></div>
<h3 id="convertingpnmtojpgthentopdfandfixpagesize">Converting PNM to JPG, then to PDF, and fix page size</h3>
<p>However, the pages in <code>document.pdf</code> are 58.31×82.47 cm, which corresponds to
about 72 dpi with respect to the size of the original images. But <code>convert</code>
also allows us to specify the pixel density, so we'll set that to 200 dpi
in X and Y direction, which was the resolution at which the images were scanned:</p>
<pre><code>$ convert image*jpg -density 200x200 document.pdf
</code></pre>
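<p>The arithmetic can be sanity-checked in the shell: at 200 dpi, the 1653×2338 pixel scans map back to almost exactly A4 size.</p>

```shell
# cm = pixels / dpi * 2.54
awk 'BEGIN { printf "%.1f x %.1f cm\n", 1653/200*2.54, 2338/200*2.54 }'
# prints: 21.0 x 29.7 cm -- A4 is 21.0 x 29.7 cm
```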
<p><em>Update:</em> You can also use the <a href="http://www.imagemagick.org/script/command-line-options.php#page" title="ImageMagick: Command-line Options"><code>-page</code> parameter</a> to set the page size
directly. It takes a multitude of predefined paper formats (see link) and will
do the pixel density calculation for you, as well as adding any necessary
offset if the image ratio is not quite exact:</p>
<pre><code>$ convert image*jpg -page A4 document.pdf
</code></pre>
<p>With that approach, I could reduce the size of my PDF from 250 MB with
losslessly compressed images to 38 MB with DCT compression.</p>
<p><em>Another update (2023):</em> Marcus notified me that it is possible to use
ImageMagick's <code>-compress jpeg</code> option; this way we can leave out the
intermediate step and convert PNM to PDF directly:</p>
<pre><code>$ convert image*.pnm -compress jpeg -quality 85 output.pdf
</code></pre>
<p>You can also play around with the <code>-quality</code> parameter to set the JPEG
compression level (100% gives almost pristine but huge images; 1% gives very
small, very blocky images); 85% should still be readable for most documents
at that resolution.</p>
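<p>To find a good trade-off empirically, you could run a hypothetical comparison loop over a few quality levels and compare the resulting file sizes (this assumes the PNM files from the multipage project folder are in the current directory, and the <code>output-q*</code> names are made up for the example):</p>

```shell
for q in 50 75 85 95; do
  convert image*.pnm -compress jpeg -quality "$q" "output-q$q.pdf"
done
ls -lh output-q*.pdf   # pick the smallest file that still reads well
```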
<h2 id="toolongdidntread">Too long, didn’t read</h2>
<p>Here’s the gist for you:</p>
<ul>
<li>Read the article above, it’s very comprehensive :P</li>
<li><p>Use <code>convert</code> on XSane’s multipage images and specify your
scanning resolution:</p>
<pre><code>$ convert image*.pnm image.jpg
$ convert image*jpg -density 200x200 document.pdf
</code></pre></li>
</ul>
<h2 id="furtherreading">Further reading</h2>
<p>There is probably software out there which does those things for you, with a
shiny user interface, but I could not find one quickly. What I did find, though,
was <a href="http://blog.konradvoelkel.de/2013/03/scan-to-pdfa/" title="Konrad Voelkel: Linux, OCR and PDF: Scan to PDF/A">this detailed article</a>, which describes how to get
high-resolution scans with OCR information in PDF/A and DjVu format, using
<code>scantailor</code> and <code>unpaper</code>.</p>
<p>Also, Didier Stevens helped me understand stream objects in his
<a href="http://blog.didierstevens.com/2008/05/19/pdf-stream-objects/" title="Didier Stevens: PDF Stream Objects">illustrated blogpost</a>. He seems to write about PDF more
often, and it was fun to poke around in his blog. There is also a nice script,
<a href="http://blog.didierstevens.com/programs/pdf-tools/" title="Didier Stevens: PDF Tools"><code>pdf-parser</code></a>, which helps you visualize the structure of a PDF
document.</p>
Portal on Linux: fix black screen without textureshttps://rohieb.name/blag/post/portal-on-linux-fix-black-screen-without-textures/rohieb
CC-BY-SA 3.0
2013-11-09T04:24:49Z2013-10-29T03:26:12Z
<p><strong>Problem:</strong> I just bought <a href="http://store.steampowered.com/app/400/">Portal</a> for Linux. When I start the game on my
AMD64 laptop with Debian testing, I only see black objects, and a few light
stripes in between. Everything else works, I can hear sound, I can interact with
objects, and I can look around, in which case the stripes also move in the
right directions, so they seem to be speckles or reflections rendered on
objects; only the textures are missing.</p>
<p><strong>Solution:</strong> Searching the Steam forums resulted in nothing (who would have
guessed), but <a href="https://01.org/linuxgraphics/comment/358#comment-358">this forum post</a> suggested to update Mesa to version 9.2
and install <code>libtxc-dxtn</code> or <code>libtxc-dxtn-s2tc0</code>. These packages were not
installed on my system, and the package description says they are used
for texture compression, so it seemed to be related. So I first tried to install
the i386 version:</p>
<pre><code>aptitude install libtxc-dxtn-s2tc0 libtxc-dxtn-s2tc0:i386
</code></pre>
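<p>If apt cannot find the <code>:i386</code> package at all, the i386 foreign architecture may not be enabled yet. On multiarch-capable Debian/Ubuntu systems this can be fixed first; a sketch, to be run as root:</p>

```shell
dpkg --add-architecture i386   # enable i386 as a foreign architecture
apt-get update                 # refresh package lists for the new arch
aptitude install libtxc-dxtn-s2tc0:i386
```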
<p>After restarting Portal, the problem was gone, so I refrained from updating my
Mesa <img src="https://rohieb.name/smileys/smile.png" alt=":-)" /></p>
<p>Before and after images (probably Copyright by Valve, but I consider this to be
fair use):</p>
<div class="gallery">
<table class="img"><caption>Before installing
`libtxc-dxtn-s2tc0`. Only faint lines are visible.</caption><tr><td><a href="https://rohieb.name/blag/post/portal-on-linux-fix-black-screen-without-textures/testchmb_a_000000.jpg"><img src="https://rohieb.name/blag/post/portal-on-linux-fix-black-screen-without-textures/x200-testchmb_a_000000.jpg" width="250" height="200" alt="Before" class="img" /></a></td></tr></table>
<table class="img"><caption>After installing
libtxc-dxtn-s2tc0, everything is great.</caption><tr><td><a href="https://rohieb.name/blag/post/portal-on-linux-fix-black-screen-without-textures/testchmb_a_000002.jpg"><img src="https://rohieb.name/blag/post/portal-on-linux-fix-black-screen-without-textures/x200-testchmb_a_000002.jpg" width="250" height="200" alt="After" class="img" /></a></td></tr></table>
</div>
Splitting overly large hunks in patcheshttps://rohieb.name/blag/post/splitting-overly-large-hunks-in-patches/rohieb
CC-BY-SA 3.0
2013-10-24T16:30:29Z2013-10-24T16:30:29Z
<p>Today I stumbled over a lengthy patch on my harddisk. It was about half a year
old, and consisted of only one hunk, which was about 1000 lines in length. Most
of the contents were indentation changes from tabs to spaces, but I knew that
the patch contained a small useful portion, which I wanted to extract. What was
slightly more annoying was the fact that the patch did not apply cleanly to the
file it was supposed to change, and since <code>patch</code> only applies hunks atomically,
the whole patch was rejected.</p>
<p>Since I did not want to compare each of the lines in the patch visually and
decide whether they changed only whitespace, I tried to look for a way to split
the patch into smaller hunks. My first try was looking at the useful tools in the
<a href="http://cyberelk.net/tim/software/patchutils/">patchutils</a> package, but none of
them did what I wanted; they only allowed me to split patches into single hunks
(but my patch already had only one hunk).</p>
<p>But after a bit of googling, I found out that Emacs has a
<a href="https://www.gnu.org/software/emacs/manual/html_node/emacs/Diff-Mode.html"><code>diff-split-hunk</code></a> command, so I installed Emacs (for the first time
in my life), opened my patch, selected Emacs' Diff mode with <code>M-x diff-mode</code>,
and split the patch into smaller hunks by pressing <code>C-c C-s</code> on appropriate
context lines. After saving, the patch applied cleanly except for two smaller
hunks, which I could easily identify as containing only whitespace changes. Then
I could compare my patched file with the original file, this time ignoring
whitespace changes with <code>diff -w</code>, and, voilà, I got the seven useful lines I
wanted.</p>
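<p>The effect of <code>-w</code> is easy to reproduce with two throwaway files that differ in indentation plus one genuinely new line:</p>

```shell
# original: tab-indented; patched: space-indented, plus one real addition
printf 'int a;\n\tint b;\n'           > original.c
printf 'int a;\n    int b;\nint c;\n' > patched.c

# Plain diff reports the indentation churn as well
# (diff exits non-zero when files differ, hence the || true):
diff original.c patched.c || true

# With -w, only the genuinely new line remains:
diff -w original.c patched.c || true
```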
<p>For illustration, see the different <a href="https://rohieb.name/blag/post/splitting-overly-large-hunks-in-patches/edit-stages/">edit stages of my patch</a> on a
separate page.</p>
Use Ghostscript to convert PDF fileshttps://rohieb.name/blag/post/use-ghostscript-to-convert-pdf-files/rohieb
CC-BY-SA 3.0
2013-09-19T05:04:01Z2012-06-09T17:06:00Z
<p>If you have a PDF file and want it to be in a specific PDF version (for
example, the print shop where you just ordered some adhesive labels
wants the print master in PDF 1.3, but your Inkscape only exports PDF
1.4), Ghostscript can help:</p>
<pre><code>gs -sDEVICE=pdfwrite -dCompatibilityLevel=1.5 -dNOPAUSE -dQUIET \
-dBATCH -sOutputFile=new-pdf1.5.pdf original.pdf
</code></pre>
<p>(this converts the file <code>original.pdf</code> to PDF 1.5 and writes it to
<code>new-pdf1.5.pdf</code>)</p>
<p>Also, if you have a huge PDF of several megabyte because there are many
high-resolution pictures in it, Ghostscript can minify it (and shrink
the pictures to 96 dpi) if you use the parameter
<code>-dPDFSETTINGS=/screen</code>.</p>
Tell Xorg to re-grab the keyboardhttps://rohieb.name/blag/post/tell-xorg-to-re-grab-the-keyboard/rohieb
CC-BY-SA 3.0
2013-09-19T05:04:01Z2011-12-29T22:45:00Z
<p>OK, I was doing some debugging with Xorg, and thought I had to use the
<a href="http://en.wikipedia.org/wiki/Magic_SysRq_key">Magic SysRq</a> key to kill it. But when I had pressed Alt-SysRq-R to
give the keyboard control from Xorg back to the kernel, it turned out
that I no longer needed to do another SysRq because my Xorg magically
worked again… <img src="https://rohieb.name/smileys/smile4.png" alt=";-)" /> Unfortunately, now every time I pressed Alt-F4 to close
a window, I found myself on tty4… rather poor. So I needed some way to
tell Xorg to grab the keyboard again, and <a href="https://learninginlinux.wordpress.com/2010/06/16/debugging-notes-to-self/">there it is</a>: Just open an
xterm and execute </p>
<pre><code>sudo kbd_mode -s
</code></pre>
X screen shots from the consolehttps://rohieb.name/blag/post/x-screen-shots-from-the-console/rohieb
CC-BY-SA 3.0
2013-09-19T05:04:01Z2011-03-24T18:10:00Z
<p>I had to debug a machine that was behind a DSL-2000 connection, and had
about 20 KB/s of upstream. For that, I needed to see what was going on
on the X screen, but due to the low bandwidth (and a screen resolution
of 1920×1080), VNC was about as fast as 2 frames per minute.</p>
<p>But I found a <a href="http://www.mysql-apache-php.com/website_screenshot.htm">comparable replacement</a>: The <a href="http://www.imagemagick.org/">ImageMagick suite</a>
has a program called <code>import</code> that allows you to dump the contents of
the X screen to an image file. So I took a few screen shots from the
console via <code>DISPLAY=:0 import -window root foo.png</code> and then copied the
files to my machine.</p>
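<p>For watching the screen over time on that little bandwidth, one could take periodic shots in a hypothetical polling loop instead (this assumes X is running on display <code>:0</code>; the file-name pattern is made up):</p>

```shell
# Ten root-window screenshots, 30 seconds apart, with timestamped names
for i in $(seq 1 10); do
  DISPLAY=:0 import -window root "shot-$(date +%Y%m%d-%H%M%S).png"
  sleep 30
done
```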
Change partition type without reformattinghttps://rohieb.name/blag/post/change-partition-type-without-reformatting/rohieb
CC-BY-SA 3.0
2013-09-19T05:04:01Z2011-01-02T16:29:00Z
<p>Note to myself: it is possible to change the partition type of an already
formatted (and used) partition. For example, if you have already
formatted the partition with NTFS, but accidentally had created it with
partition type <code>0x83</code> (Linux), so Windows can’t read it, since it expects
<code>0x07</code> (HPFS/NTFS). On Linux, you can use sfdisk for that purpose:</p>
<pre><code># Be root
# dd if=/dev/sdb of=sdb-bootsector count=1 # backup boot sector
# sfdisk -d /dev/sdb | sed -e 's/Id=83/Id=07/' > /tmp/sdb.txt
# sfdisk /dev/sdb < /tmp/sdb.txt
</code></pre>
<p>(fill in the right values for your case)</p>
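<p>The <code>sed</code> substitution can be dry-run safely on a single line of a saved dump before feeding anything back to <code>sfdisk</code> (the device name and numbers below are made up for the example):</p>

```shell
# Example line in sfdisk -d dump format:
echo '/dev/sdb1 : start=2048, size=204800, Id=83' \
  | sed -e 's/Id=83/Id=07/'
# prints: /dev/sdb1 : start=2048, size=204800, Id=07
```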
<p>Of course, good old fdisk works also, use the <code>t</code> command.</p>
<p><a href="http://serverfault.com/questions/46758/can-you-change-the-partition-type-on-a-linux-server-without-starting-up-fdisk/46840#46840">(Source)</a></p>
Windows Device Manager: Code 39 with CDROM drivehttps://rohieb.name/blag/post/windows-device-manager-code-39-with-cdrom-drive/rohieb
CC-BY-SA 3.0
2013-09-19T05:04:01Z2010-11-14T13:29:00Z
<p>My sister asked me to have a look at her notebook (a Medion Akoya P6612
with Windows Vista) because the CDROM drive wouldn’t work, and it was
not even displayed in the Windows Explorer. I looked into the Device
Manager and noticed that the CDROM device (TSSTCorp SN-S083a) was
displayed with a small yellow exclamation mark besides its icon, and it
said on the Properties page that the device could not be started and
referred to Code 39. Reinstalling the drivers had no effect, but after I
had a little chat with <a href="http://google.com/">Big Blue G</a>, I found a <a href="http://www.pchell.com/hardware/cd_drive_error_code_39.shtml">howto entry</a> which
suggested the following:</p>
<ol>
<li>Be logged in with an administrator account</li>
<li>Open the Registry Editor (choose it from the Start Menu or press
Win+R and type <code>regedit</code>)</li>
<li>Navigate to
<code>HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Class\{4D36E965-E325-11CE-BFC1-08002BE10318}</code></li>
<li>Then, in the right pane, delete all of the following keys:
<ul>
<li>UpperFilters</li>
<li>LowerFilters</li>
<li>UpperFilters.bak</li>
<li>LowerFilters.bak</li>
</ul></li>
<li>Restart your computer</li>
</ol>
<p>This worked fine for us.</p>
SSH key authentication with encrypted home directorieshttps://rohieb.name/blag/post/ssh-key-authentication-with-encrypted-home-directories/rohieb
CC-BY-SA 3.0
2019-05-05T21:56:21Z2010-10-08T22:00:00Z
<p>Yesterday, I ran into an interesting problem: I tried to set up <a href="http://www.openbsd.org/cgi-bin/man.cgi?query=ssh&sektion=1#AUTHENTICATION">SSH
public key authentication</a> between two of my machines, <code>c3po</code> and
<code>r2d2</code>, so I could log in from <code>rohieb@r2d2</code> to <code>rohieb@c3po</code> without a
passphrase. However, every time I tried to log in to <code>c3po</code>, I was
prompted to enter the password for <code>rohieb@c3po</code>, and the debug output
mentioned that the key could not be verified. More
astonishing, when I established a second SSH connection while the first
was still running, I was <em>not</em> prompted for a password, and debug output
said that key authentication had been successful. I googled a bit, and
after a while got to <a href="https://bugs.launchpad.net/ubuntu/+source/openssh/+bug/362427/comments/12">this comment</a> on Launchpad, mentioning problems
when the user on the remote machine had its home directory encrypted
through ecryptfs – which was the case for me. Of course, since ecryptfs
only encrypts the user’s home <em>after</em> he has been authenticated, the SSH
daemon cannot read his <code>~/.ssh/authorized_keys</code> the first time, and
falls back to password authentication.</p>
<p>The Launchpad comment proposes to first unmount the ecryptfs filesystem,
then store <code>~/.ssh/authorized_keys</code> unencrypted, and then mount the
encrypted home again (<strong>note</strong> that no program should be running that
could try to access your home directory):</p>
<pre><code>$ ecryptfs-umount-private
$ cd $HOME
$ chmod 700 .
$ mkdir -m 700 .ssh
$ chmod 500 .
$ echo $YOUR_REAL_PUBLIC_KEY > .ssh/authorized_keys
$ ecryptfs-mount-private
</code></pre>
<p>This works indeed, but has the drawback that key authentication only
works for the <em>first</em> login, because ecryptfs hides the unencrypted
files when it mounts the encrypted directory on login; and you would have to
synchronize the encrypted and the unencrypted version of
<code>authorized_keys</code> every time you add a new key. To circumvent that, I
simply moved the file to <code>/etc/ssh/authorized_keys/rohieb</code> (with the
file only readable and writable by me, and <code>/etc/ssh/authorized_keys</code>
writable for all users) and adjusted <code>/etc/ssh/sshd_config</code>
appropriately:</p>
<pre><code>$ sudo vi /etc/ssh/sshd_config # or use your favorite editor instead of vi
[... some lines ...]
AuthorizedKeysFile /etc/ssh/authorized_keys/%u
[... some more lines ...]
$ sudo /etc/init.d/ssh restart
</code></pre>
<h2 id="update">Update</h2>
<p>There is an even better approach, which does not require editing the
sshd config at all:</p>
<ol>
<li>login to the user on the remote machine</li>
<li>create <code>/home/.ecryptfs/$USER/.ssh</code> and put your <code>authorized_keys</code> there</li>
<li><p>symlink it into your mounted (encrypted) home:</p>
<pre><code>$ ln -s /home/.ecryptfs/$USER/.ssh/authorized_keys ~/.ssh/authorized_keys
</code></pre></li>
<li><p>create the same symlink in the unencrypted home (as above, <strong>make sure</strong> no
process wants to write to your home directory in the meantime):</p>
<pre><code>$ ecryptfs-umount-private
$ mkdir ~/.ssh
$ ln -s /home/.ecryptfs/$USER/.ssh/authorized_keys ~/.ssh/authorized_keys
$ ecryptfs-mount-private
</code></pre></li>
</ol>
<p>The paths are for Ubuntu 9.10 (Karmic Koala) and later. On other
systems, you might want to replace <code>/home/.ecryptfs</code> with
<code>/var/lib/ecryptfs</code>.</p>
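<p>Put together, the whole update procedure can be sketched as one small shell function. The helper name is mine, not from the original post; run it as the user on the remote machine, and again make sure that no process writes to your home directory in the meantime:</p>

```shell
# Sketch combining the steps above; paths as on Ubuntu 9.10 and later.
# Hypothetical helper name -- adapt the store path for your distribution.
setup_unencrypted_authorized_keys() {
  local store="/home/.ecryptfs/$USER/.ssh"   # or /var/lib/ecryptfs/$USER/.ssh
  mkdir -p "$store"
  cp ~/.ssh/authorized_keys "$store/"                      # step 2: move keys out
  ln -sf "$store/authorized_keys" ~/.ssh/authorized_keys   # step 3: encrypted home
  ecryptfs-umount-private                                  # step 4: repeat the
  mkdir -p ~/.ssh                                          # symlink on the
  ln -sf "$store/authorized_keys" ~/.ssh/authorized_keys   # unencrypted home
  ecryptfs-mount-private
}
```

<p>Adapt the <code>store</code> path if your system keeps the encrypted data under <code>/var/lib/ecryptfs</code>.</p>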
<h1><a href="https://rohieb.name/blag/post/zsnes-on-amd64-ubuntu/">ZSNES on AMD64 Ubuntu</a></h1>
<p>rohieb, CC-BY-SA 3.0, published 2010-10-05, updated 2019-05-05</p>
<p><strong>[ Update, 2013-10:</strong> This post is not up to date anymore. On newer
Debians (since 7.0/wheezy) and Ubuntus (at least since 12.04, Precise Pangolin),
you should be able to install zsnes out of the box: <code>sudo apt-get install
zsnes:i386</code>. For details see the MultiArch documentation for
<a href="https://wiki.debian.org/Multiarch/">Debian</a> and <a href="https://help.ubuntu.com/community/MultiArch">Ubuntu</a>. <strong>]</strong></p>
<p>Before I bought my current hardware, I was working on a 32-bit-based
system, and I really appreciated ZSNES as an SNES emulator. But
unfortunately, my new hardware was an AMD64 system, and there is
currently no ZSNES package for 64-bit Ubuntu or Debian <img src="https://rohieb.name/smileys/sad.png" alt=":(" /> So I decided
to google a bit about the issue, but it took me until now (a year later)
to finally get ZSNES working on my machine. The problem is, if you build
ZSNES on a 64-bit machine, all the application does is segfault at
start, and if you <a href="http://board.zsnes.com/phpBB3/viewtopic.php?p=118067&sid=dd9a2a54d9178eb5009c33586aea703c#p118067">try to compile for 32-bit systems</a>, you get errors
about missing 32-bit libs (in particular, configure does not find a
suitable <code>libsdl</code>). Instead, if you just take the binary which was
compiled on a 32-bit system, and install the <code>ia32-libs</code> package,
everything seems to work—at least I was able to play a few levels of
Super Mario World successfully <img src="https://rohieb.name/smileys/smile.png" alt=":-)" /> </p>
<p>So here was my idea: take the 32-bit package from the Ubuntu repository,
and just change the Architecture control field, thereby fooling dpkg :P
And as it turned out, this idea worked great. You can get the Debian
package here if you want, it <em>should</em> work for Ubuntu Karmic and Lucid,
as well as for Debian testing (<strong>but</strong> I only tested it on Lucid, so
there is no warranty here—but I’m happy to hear if it works :-)):</p>
<ul>
<li><a href="http://rohieb.name/stuff/zsnes_1.510-2.2ubuntu3~ppa1_amd64.deb">zsnes_1.510-2.2ubuntu3~ppa1_amd64.deb</a></li>
<li>SHA1: <code>716bbd37267b477ef02961a7727212619309b83f</code></li>
<li>MD5: <code>452ea5230ad17df1dee649ab4cc6c8c0</code></li>
</ul>
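<p>To check that the download is intact, the checksums above can be compared with the coreutils tools; a minimal sketch (the helper name is hypothetical, not from this post):</p>

```shell
# Hypothetical helper: check a file against an expected SHA1 checksum.
verify_sha1() {
  [ "$(sha1sum "$1" | cut -d' ' -f1)" = "$2" ]
}
# Usage, with the file name and SHA1 from above:
# verify_sha1 zsnes_1.510-2.2ubuntu3~ppa1_amd64.deb \
#     716bbd37267b477ef02961a7727212619309b83f && echo "checksum OK"
```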
<h2 id="howtoreproduceit">How to Reproduce It</h2>
<p>For the curious, here is what I actually did:</p>
<ol>
<li><code>wget http://archive.ubuntu.com/ubuntu/pool/universe/z/zsnes/zsnes_1.510-2.2ubuntu3_i386.deb</code></li>
<li><code>ar x zsnes_1.510-2.2ubuntu3_i386.deb</code></li>
<li><code>tar xzf data.tar.gz</code></li>
<li><p>Edit <code>usr/share/applications/zsnes.desktop</code> and add <code>-ad sdl</code> to the
<code>Exec=</code> line; otherwise it just segfaults on the first run:</p>
<pre><code>Exec=zsnes -ad sdl
</code></pre></li>
<li><p>Edit <code>usr/share/doc/zsnes/changelog.Debian.gz</code> and add a new
changelog entry for the new version (just copy one of the previous
entries and adapt it)</p></li>
<li><code>tar xzf control.tar.gz</code></li>
<li><p>Edit the <code>control</code> file: bump the <code>Version:</code>, set the <code>Architecture:</code>
field to <code>amd64</code>, add the <code>ia32-libs</code> dependency, and set myself as
maintainer:</p>
<pre><code>Package: zsnes
Version: 1.510-2.2ubuntu3~ppa1
Architecture: amd64
Maintainer: Roland Hieber <foobar@example.org>
Installed-Size: 4160
Depends: ia32-libs, libao2 (>= 0.8.8), libc6 (>= 2.4), libgcc1 (>= 1:4.1.1),
libgl1-mesa-glx | libgl1, libpng12-0 (>= 1.2.13-4),
libsdl1.2debian (>= 1.2.10-1), libstdc++6 (>= 4.1.1), zlib1g (>= 1:1.2.2.3)
[...]
</code></pre></li>
<li><p>Update the <code>md5sums</code> file with the right checksums for
<code>usr/share/applications/zsnes.desktop</code> and
<code>usr/share/doc/zsnes/changelog.Debian.gz</code> (I used the <code>md5sum</code>
command and copy-pasted its output)</p></li>
<li><code>tar czf control.tar.gz control md5sums postrm postinst</code></li>
<li><code>tar czf data.tar.gz usr/</code></li>
<li><code>ar r zsnes_1.510-2.2ubuntu3~ppa1_amd64.deb debian-binary
control.tar.gz data.tar.gz</code></li>
</ol>
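<p>The manual steps above can also be condensed into a small shell function. This is only a sketch: the function name and the minimal error handling are mine, and it assumes the old gzip-based .deb layout that the 2010 package uses. It patches the <code>Architecture:</code> and <code>Depends:</code> fields and repacks; the desktop file, changelog, version and md5sums edits from the list above still have to be done by hand.</p>

```shell
# Hypothetical helper: unpack an i386 .deb, patch the control file to
# amd64 with an additional ia32-libs dependency, and repack the result.
repack_i386_deb() {
  local in="$1" out="$2"
  local work; work=$(mktemp -d)
  ( cd "$work" || exit 1
    ar x "$OLDPWD/$in"                      # unpack the .deb members
    mkdir control data
    tar xzf control.tar.gz -C control
    tar xzf data.tar.gz -C data
    sed -i -e 's/^Architecture: .*/Architecture: amd64/' \
           -e 's/^Depends: /Depends: ia32-libs, /' control/control
    tar czf control.tar.gz -C control .     # repack both tarballs
    tar czf data.tar.gz -C data .
    ar r "$OLDPWD/$out" debian-binary control.tar.gz data.tar.gz )
  rm -rf "$work"
}
```

<p>Note that <code>ar r</code> must see <code>debian-binary</code> first, since dpkg expects it as the first archive member.</p>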
<p>I’m afraid that I can’t put the package into <a href="https://launchpad.net/~rohieb/+archive/ppa">my PPA</a>, Launchpad only
accepts source packages for uploads, and builds the binary packages
itself, both for i386 and AMD64. This approach cannot be used here,
since we need the i386 binary on AMD64.</p>