Thumbnail a website

From Wirespeed

Warning: This is experimental code. Don't blame me if your computer combusts spontaneously!

You may tell me though how I can improve this procedure.


From the khtml2png homepage

Requirements:

  • g++
  • KDE 3.x
  • kdelibs for KDE 3.x (kdelibs4-dev)
  • zlib (zlib1g-dev)
  • cmake
  • khtml2png

Running it on a machine without an X display:

Code: Run khtml2png under Xvfb
Xvfb :1 &
# give the virtual X server some time to start up (maybe you need to increase the value)
sleep 3
export DISPLAY=localhost:1.0
khtml2png2 <parameters>
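
The fixed sleep in the start-up sequence above is a guess at how long Xvfb needs. A possible alternative, sketched here under the assumption that the X server creates its Unix socket as /tmp/.X11-unix/X<display> (the usual location on Linux), is to poll for that socket. The function name wait_for_display and its third argument are my own additions, not part of khtml2png or Xvfb.

```shell
#!/bin/sh
# Wait until the X server for display number $1 has created its Unix socket.
# $2 = maximum number of seconds to wait (default 10)
# $3 = socket directory (assumed default /tmp/.X11-unix; override for testing)
wait_for_display() {
    display=$1
    timeout=${2:-10}
    sockdir=${3:-/tmp/.X11-unix}
    waited=0
    while [ ! -e "$sockdir/X$display" ]; do
        # Give up once the timeout has been reached
        [ "$waited" -ge "$timeout" ] && return 1
        sleep 1
        waited=$((waited + 1))
    done
    return 0
}

# Usage (commented out so the sketch has no side effects):
# Xvfb :1 &
# wait_for_display 1 10 || { echo "Xvfb did not start" >&2; exit 1; }
# export DISPLAY=localhost:1.0
# khtml2png2 <parameters>
```

The socket only shows that the server has started; it is not a guarantee that every extension is ready, so a short extra sleep may still be needed.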

Fix the system

To make x11-base/xorg-server compile, I had to add a symlink:

Code: Add a missing symlink
ln -s /bin/sed /usr/bin/sed

In the past I have created several of these symlinks. It is worth checking them as well, because the errors during compilation are very vague, to say the least:

Code: Symlink check
# ls -l /usr/bin | grep ' /bin/'
lrwxrwxrwx 1 root   root         15 Mar 31 20:35 awk -> /bin/gawk-3.1.5
lrwxrwxrwx 1 root   root         13 Jun 17 22:48 basename -> /bin/basename
lrwxrwxrwx 1 root   root          8 Apr  1 19:54 cat -> /bin/cat
lrwxrwxrwx 1 root   root         11 Jun 17 22:48 chroot -> /bin/chroot
lrwxrwxrwx 1 root   root          8 Jun 17 22:48 cut -> /bin/cut
lrwxrwxrwx 1 root   root          8 Jun 17 22:48 dir -> /bin/dir
lrwxrwxrwx 1 root   root         12 Jun 17 22:48 dirname -> /bin/dirname
lrwxrwxrwx 1 root   root          7 Jun 17 22:48 du -> /bin/du
lrwxrwxrwx 1 root   root          8 Jun 17 22:48 env -> /bin/env
lrwxrwxrwx 1 root   root          9 Jun 17 22:48 expr -> /bin/expr
lrwxrwxrwx 1 root   root         15 Mar 31 20:35 gawk -> /bin/gawk-3.1.5
lrwxrwxrwx 1 root   root          9 Jun 17 22:48 head -> /bin/head
lrwxrwxrwx 1 root   root         13 Mar 31 20:36 hostname -> /bin/hostname
lrwxrwxrwx 1 root   root         11 Jun 17 22:48 mkfifo -> /bin/mkfifo
lrwxrwxrwx 1 root   root          9 Jul 10 23:08 nano -> /bin/nano
lrwxrwxrwx 1 root   root         11 Jul  3 21:44 passwd -> /bin/passwd
lrwxrwxrwx 1 root   root         13 Jun 17 22:48 readlink -> /bin/readlink
lrwxrwxrwx 1 root   root          8 Jul 27 17:32 sed -> /bin/sed
lrwxrwxrwx 1 root   root          8 Jun 17 22:48 seq -> /bin/seq
lrwxrwxrwx 1 root   root         12 Nov  5  2006 setfont -> /bin/setfont
lrwxrwxrwx 1 root   root         10 Jun 17 22:48 sleep -> /bin/sleep
lrwxrwxrwx 1 root   root          9 Jun 17 22:48 sort -> /bin/sort
lrwxrwxrwx 1 root   root          9 Jun 17 22:48 tail -> /bin/tail
lrwxrwxrwx 1 root   root          8 Sep 20  2006 tar -> /bin/tar
lrwxrwxrwx 1 root   root         10 Jun 17 22:48 touch -> /bin/touch
lrwxrwxrwx 1 root   root          7 Jun 17 22:48 tr -> /bin/tr
lrwxrwxrwx 1 root   root          8 Jun 17 22:48 tty -> /bin/tty
lrwxrwxrwx 1 root   root         10 Jun 17 22:48 uname -> /bin/uname
lrwxrwxrwx 1 root   root          9 Jun 17 22:48 vdir -> /bin/vdir
lrwxrwxrwx 1 root   root          7 Jun 17 22:48 wc -> /bin/wc
lrwxrwxrwx 1 root   root          8 Jun 17 22:48 yes -> /bin/yes
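
Instead of reading the listing by eye, the check can be scripted. The following is only a sketch: missing_links is a hypothetical helper name, and which tools a given build expects in /usr/bin is something you have to decide yourself.

```shell
#!/bin/sh
# Print every tool that exists in directory $1 (e.g. /bin) but is
# missing from directory $2 (e.g. /usr/bin).
# The remaining arguments are the tool names to check.
missing_links() {
    src=$1
    dst=$2
    shift 2
    for tool in "$@"; do
        if [ -e "$src/$tool" ] && [ ! -e "$dst/$tool" ]; then
            echo "$tool"
        fi
    done
}

# Usage:
# missing_links /bin /usr/bin sed awk cat tar
```

Each name that is printed is a candidate for a symlink like the one for sed above.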

/etc/make.conf

Consider this a delta to the existing /etc/make.conf

File: /etc/make.conf
INPUT_DEVICES=""
VIDEO_CARDS="fbdev"
USE="${USE} -alsa -arts -mp3 -cups -opengl -sis npp"

Ebuild x11-libs/cairo requires USE="X" to enable plugins.

File: /etc/portage/package.use
x11-libs/cairo          X


Build required packages

Note that kmplayer is emerged to make the Flash plugin work. Do not remove it from the emerge list if you want Flash to work properly.

Flash is still an issue.

Code: emerge builds
emerge -kvN \
zlib \
cmake \
x11-base/xorg-server \
x11-apps/xwd \
kde-base/kdelibs

# packages to get flash working
emerge -kvN \
net-www/netscape-flash \
kde-base/nsplugins \
media-video/kmplayer \
kde-base/kreadconfig

emerge -k \
media-fonts/artwiz-aleczapka-en \
media-fonts/corefonts \
media-fonts/dejavu \
media-fonts/encodings \
media-fonts/font-adobe-100dpi \
media-fonts/font-adobe-75dpi \
media-fonts/font-adobe-utopia-type1 \
media-fonts/font-alias \
media-fonts/font-bh-ttf \
media-fonts/font-bh-type1 \
media-fonts/font-cursor-misc \
media-fonts/font-misc-misc \
media-fonts/font-util \
media-fonts/freefonts \
media-fonts/gnu-gs-fonts-std \
media-fonts/terminus-font \
media-fonts/ttf-bitstream-vera \
media-fonts/unifont


Make sure all dependencies are up to date:

Code: dependency update
revdep-rebuild
dispatch-conf
etc-update
DISPLAY=:0.0 kbuildsycoca

Check the exact path of libflashplayer.so before running the commands below; it may vary across platforms.

Code: configure flash
kwriteconfig --file kmplayerrc --group "application/x-shockwave-flash" --key player npp
kwriteconfig --file kmplayerrc --group "application/x-shockwave-flash" --key plugin /usr/lib/nsbrowser/plugins/libflashplayer.so
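
Since the plugin path differs between systems, a small lookup may help. This is a sketch only: find_flash_plugin is a hypothetical helper, and apart from /usr/lib/nsbrowser/plugins (the path used above) the directories in the usage example are guesses that you should replace with the ones on your system.

```shell
#!/bin/sh
# Print the first libflashplayer.so found in the given directories.
# Returns non-zero if none of the directories contains the plugin.
find_flash_plugin() {
    for dir in "$@"; do
        if [ -f "$dir/libflashplayer.so" ]; then
            echo "$dir/libflashplayer.so"
            return 0
        fi
    done
    return 1
}

# Usage (second and third directory are assumptions):
# plugin=$(find_flash_plugin /usr/lib/nsbrowser/plugins \
#                            /usr/lib/mozilla/plugins \
#                            /opt/netscape/plugins) &&
#   kwriteconfig --file kmplayerrc --group "application/x-shockwave-flash" --key plugin "$plugin"
```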


Code: Update current environment
env-update && source /etc/profile

First check

Time to check the setup so far. Not perfect at the time of writing, but when opening the resulting file in konqueror, an xclock is visible, though heavily interlaced. Use the -br flag to change the checkered background into a solid black one.

Code: Check setup
Xvfb :0 -br &
DISPLAY=:0 xclock &
xwd -display localhost:0 -root | xwdtopnm > /tmp/dump.pmm

To get rid of the interlacing effect, just convert the dump to a JPG image. This requires ImageMagick.

Code: convert pmm to jpg
convert /tmp/dump.pmm /tmp/dump.jpg
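
If Xvfb is not running, xwd fails and the dump file ends up empty, which convert then reports with a rather unhelpful error. A quick sanity check, sketched here, is to look at the first two bytes: PNM files start with a magic number P1 to P6. The helper name is_pnm is my own.

```shell
#!/bin/sh
# Succeed if file $1 starts with a PNM magic number (P1 .. P6).
is_pnm() {
    magic=$(head -c 2 "$1" 2>/dev/null)
    case $magic in
        P[1-6]) return 0 ;;
        *)      return 1 ;;
    esac
}

# Usage:
# is_pnm /tmp/dump.pmm || echo "dump is not a PNM image, is Xvfb running?" >&2
```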

Build khtml2png

Exit the current shell and log back in

Code: Installing khtml2png
cd
wget http://dfn.dl.sourceforge.net/sourceforge/khtml2png/khtml2png-<latest-version>.tar.gz
tar xvzf khtml2png-2.6.7a.tar.gz
cd khtml2png-2.6.7a/
./configure
make
make install
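
The download line uses a version placeholder while the tar and cd lines use a fixed version, so they have to be adjusted by hand. One way to keep them in sync, sketched under the assumption that the tarball unpacks into a directory named after itself (as it does in the commands above), is to derive the directory from the file name. tarball_dir is a hypothetical helper name.

```shell
#!/bin/sh
# Derive the name of the unpacked directory from a .tar.gz file name.
tarball_dir() {
    name=$(basename "$1")
    echo "${name%.tar.gz}"
}

# Usage:
# tarball=khtml2png-2.6.7a.tar.gz
# tar xzf "$tarball" && cd "$(tarball_dir "$tarball")"
```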

Test khtml2png

Here is the big moment we've all been waiting for ... Let's create a thumbnail image for http://www.gentoo.org/

Code: Create thumbnail
Xvfb :0 -wr &
DISPLAY=localhost:0.0 khtml2png2 --width 1280 --height 1024 http://www.gentoo.org/ output.png
convert output.png -resize 150 output2.png
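
A note on the arithmetic: with only one number, -resize sets the width and ImageMagick keeps the aspect ratio, so a 1280x1024 capture becomes roughly 150x120. The sketch below just reproduces that calculation with rounding to the nearest pixel; thumb_height is a hypothetical helper and ImageMagick's own rounding may differ by a pixel in edge cases.

```shell
#!/bin/sh
# Height of a thumbnail scaled to a target width, aspect ratio preserved.
# $1 = source width, $2 = source height, $3 = target width
thumb_height() {
    # Integer arithmetic, rounded to the nearest pixel
    echo $(( ($2 * $3 + $1 / 2) / $1 ))
}

thumb_height 1280 1024 150   # prints 120
```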

Image:Thumbnail.jpg

Notice that some fonts do not render correctly, and the colouring of the fonts seems to change randomly on every load of the web page. The cause was unclear at first. Solved: it was a setting for running Xvfb; the -screen parameters were missing. Here is what happens when khtml2png is executed:

Code: Xvfb :0 -wr
# Xvfb :0 -wr
FreeFontPath: FPE "/usr/share/fonts/misc/" refcount is 2, should be 1; fixing.
Could not init font path element /usr/share/fonts/OTF, removing from list!

Code: khtml2png
# DISPLAY=localhost:0.0 khtml2png2 --width 1680 --height 1050 http://www.gentoo.org/ output.png
Xlib:  extension "XInputExtension" missing on display "localhost:0.0".
Failed to get list of devices
Xlib:  extension "XInputExtension" missing on display "localhost:0.0".
Failed to get list of devices
kbuildsycoca running...
DCOP Cleaning up dead connections.
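
Because the fix was to pass the -screen parameters, it may help to build the Xvfb arguments in one place. This is a sketch: xvfb_args is a hypothetical helper, and the geometry 1280x1024x24 in the usage line is simply the one used elsewhere on this page.

```shell
#!/bin/sh
# Build the Xvfb arguments for one display with an explicit screen 0.
# $1 = display number, $2 = width, $3 = height, $4 = colour depth
xvfb_args() {
    echo ":$1 -screen 0 ${2}x${3}x${4}"
}

# Usage (the unquoted substitution is intentional, it must split into words):
# Xvfb $(xvfb_args 0 1280 1024 24) &
```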

Online website thumbnailer

Create directory for log files

Code:
mkdir /var/log/khtml2png
chown apache /var/log/khtml2png

Make Xvfb run automatically as apache

File: /etc/conf.d/local.start
# Start virtual framebuffers for thumbnailing
# (configure screen 0 on every display, because DISPLAY=localhost:N.0 refers to screen 0)
for a in 1 2 3 4 5; do
        /usr/bin/Xvfb :$a -screen 0 1280x1024x24 2> /dev/null &
done

File: /etc/conf.d/local.stop
# Stop virtual framebuffers for thumbnailing
/usr/bin/killall /usr/bin/Xvfb

Code: thumbnail.php
#!/usr/bin/perl

# Script to build and serve thumbnails
use warnings;
use strict;

# use CGI qw/:standard:/;
use CGI qw(param);
use Digest::MD5  qw(md5_hex);
use Fcntl qw(:flock);


my $query=new CGI;

my $imagedir='/var/www/localhost/htdocs/thumbnail/cache';
my $maxage=7;
my $maxXvfb=5;
my $minXvfb=1;
my $flags="--disable-plugins --time 30";
my $logfile_prefix="/var/log/khtml2png/sess_";

my ( $age, $lockfile, $Xvfb );

# Retrieve parameters from URL
# sdx=website image width
# sdy=website image height
# w=thumbnail width
# h=thumbnail height
# url=website URL
my $sdx=param("sdx");
my $sdy=param("sdy");
my $w=param("w");
my $h=param("h");
my $url=param("url");

# Create a unique filename for caching purposes
my $image=${imagedir}."/".md5_hex("${url}.${sdx}.${sdy}.${w}.${h}");
# If image exists and is younger than a week, use the existing image.
if ( -e "${image}.jpg" ) {
        $age = -M "${image}.jpg";
} else {
        $age = $maxage+1;
}

if ( $age > $maxage ) {

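        # Display locking and support for multiple Xvfb's:
        # try the lock file of each display in turn and use the first
        # display whose lock can be taken without blocking.
        $Xvfb=${minXvfb};
        while (1) {
                $lockfile = "${imagedir}/lock-${Xvfb}";
                open (LOCKFILE, ">> ${lockfile}") || die "Cannot open lock file ${lockfile}: $!\n";
                last if flock LOCKFILE, LOCK_EX | LOCK_NB;
                close(LOCKFILE);
                if ( ${Xvfb} < ${maxXvfb} ) {
                        $Xvfb++;
                } else {
                        # All displays are busy, wait and start over
                        sleep 1;
                        $Xvfb=${minXvfb};
                }
        }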

                # flock LOCKFILE, 2;

        # Create website image
        # DISPLAY=localhost:0.0 khtml2png2 --width 1280 --height 1024 http://www.gentoo.org/ image.png
        if ( -e "${logfile_prefix}:${Xvfb}.log.1") { unlink("${logfile_prefix}:${Xvfb}.log.1"); }
        if ( -e "${logfile_prefix}:${Xvfb}.log"   ) { rename("${logfile_prefix}:${Xvfb}.log", "${logfile_prefix}:${Xvfb}.log.1"); }
        open(LOGFILE, "> ${logfile_prefix}:${Xvfb}.log") || die "Cannot open file ${logfile_prefix}:${Xvfb}.log for output.\n";
        my ($second, $minute, $hour, $day, $month, $year, $dummy1, $dummy2, $dummy3)=localtime(time);
        $year=${year}+1900;
        $month=${month}+1;      # localtime() counts months from 0
        print LOGFILE "
========================================
date:   ${year}-${month}-${day} ${hour}:${minute}:${second}
url:    ${url}
flags:  ${flags}
width:  ${sdx}
height: ${sdy}
image:  ${image}.png
size_x: ${w}
size_y: ${h}
";
        close(LOGFILE);
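        # \Q...\E backslash-escapes every non-word character, so that user-supplied
        # values cannot inject shell commands
        system("HOME=/tmp DISPLAY=localhost:${Xvfb}.0 /usr/local/bin/khtml2png2 ${flags}  --width \Q${sdx}\E --height \Q${sdy}\E \Q${url}\E ${image}.png 2>> ${logfile_prefix}:${Xvfb}.log");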

        # Create thumbnail
        # convert image.png -resize thumbnail.jpg
        system("/usr/bin/convert ${image}.png -resize \Q${w}\Ex\Q${h}\E ${image}.jpg 2>> ${logfile_prefix}:${Xvfb}.log");
        if ( -e "${image}.png") { system("/bin/rm ${image}.png 2>> ${logfile_prefix}:${Xvfb}.log"); }

        open(LOGFILE, ">> ${logfile_prefix}:${Xvfb}.log") || die "Cannot open file ${logfile_prefix}:${Xvfb}.log for output.\n";
        ($second, $minute, $hour, $day, $month, $year, $dummy1, $dummy2, $dummy3)=localtime(time);
        $year=${year}+1900;
        $month=${month}+1;      # localtime() counts months from 0
        print LOGFILE "
date:   ${year}-${month}-${day} ${hour}:${minute}:${second}
FINISHED ${url}
========================================
";
        close(LOGFILE);
        flock LOCKFILE, LOCK_UN;
        close(LOCKFILE);
}


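# Serve thumbnail
print $query->header(-content_type=>"image/jpeg");

open(INFILE,"< ${image}.jpg") || die "Cannot open file ${image}.jpg: $!";
# The image is binary data, so switch off any newline translation
binmode(INFILE);
binmode(STDOUT);
while (<INFILE>) { print $_; }
close(INFILE);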
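
To inspect or clean the cache by hand it helps to reproduce the script's cache logic on the command line. The following shell sketch mirrors it: the file name is the MD5 of "url.sdx.sdy.w.h" and an image counts as fresh while it is younger than the maximum age in days. The helper names cache_key and is_fresh are mine, md5sum is assumed to be the GNU coreutils tool, and find -mtime works in whole days, so the age check is slightly coarser than Perl's -M.

```shell
#!/bin/sh
# Cache file stem for a request: MD5 of "url.sdx.sdy.w.h".
# $1 = url, $2 = sdx, $3 = sdy, $4 = w, $5 = h
cache_key() {
    printf '%s' "$1.$2.$3.$4.$5" | md5sum | cut -d' ' -f1
}

# Succeed if file $1 exists and was modified less than $2 days ago.
is_fresh() {
    [ -n "$(find "$1" -mtime -"$2" 2>/dev/null)" ]
}

# Usage:
# key=$(cache_key http://www.gentoo.org/ 1280 1024 150 120)
# is_fresh "/var/www/localhost/htdocs/thumbnail/cache/$key.jpg" 7 && echo "cached"
```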

On Ubuntu Server

Code: Install Xvfb
<snapshot.0>
$ sudo apt-get install virtualbox-ose-guest-source virtualbox-ose-guest-utils
$ sudo apt-get install xvfb x11-apps netpbm openssh-server imagemagick cmake gcc g++ \
kdelibs kdebase-runtime 

# header files
$ sudo apt-get install kdelibs4-dev libkde3-java libkde3-jni

# additional fonts
$ sudo apt-get install console-terminus defoma fontconfig kbd kdb libfont-afm-perl \
                       libfontconfig1 libfontenc libfontenc1 libfreetype6 libfs6 \
                       libpango libxfont1 libxft2 psfontmgr ttf-arphic-uming \
                       ttf-dejavu ttf-indic-fonts-core ttf-lao ttf-opensymbol \
                       ttf-thai-tlwg ttf-vlgothic ttf-wqy-microhei xfonts-100dpi \
                       xfonts-75dpi xfonts-base xfonts-scalable


$ Xvfb :0 -screen 0 128x102x24 &
$ DISPLAY=:0 xclock &
$ xwd -display localhost:0 -root | xwdtopnm > /tmp/dump.pmm
$ convert /tmp/dump.pmm /tmp/dump.jpg

<snapshot.1>
$ cd
$ wget http://dfn.dl.sourceforge.net/sourceforge/khtml2png/khtml2png-<latest-version>.tar.gz
$ tar xzf khtml2png-2.7.6.tar.gz
$ cd khtml2png-<latest-version>

$ cd ~jhendrix/khthml2png-<latest-version>

$ ./configure
$ make 
$ sudo make install


$ Xvfb :0 -screen 0 128x102x24 &                                      
[1] 13847                                                            
Could not init font path element /usr/share/fonts/X11/cyrillic, removing from list!
(EE) config/hal: NewInputDeviceRequest failed (2)                                                              
(EE) config/hal: NewInputDeviceRequest failed (2)                                                              
(EE) config/hal: NewInputDeviceRequest failed (2)                                                              
(EE) config/hal: NewInputDeviceRequest failed (2)                                                              
(EE) config/hal: NewInputDeviceRequest failed (2)
(EE) config/hal: NewInputDeviceRequest failed (2)

$ DISPLAY=localhost:0.0 khtml2png2 --visual TrueColor --width 1280 --height 1024 --disable-js http://www.gentoo.org/ output.png
kbuildsycoca running...

$ convert output.png -resize 150 output2.png

Check /etc/fstab for changes on cryptswap.
