Libgweather xml copy pasta



  • I was looking at the size of the packages I had installed and noticed libgweather took 80MB. libgweather apparently "Provides access to weather information from the net", not something you'd expect to take 80MB.

    I found most of the space was taken up by xml files.

    $ ls -sh /usr/share/libgweather/
    total 76M
     916K Locations.ang.xml        964K Locations.et.xml   976K Locations.lt.xml     1000K Locations.ru.xml
     960K Locations.ar.xml         948K Locations.eu.xml   952K Locations.lv.xml      916K Locations.rw.xml
     1.1M Locations.as.xml         952K Locations.fa.xml   1.1M Locations.mai.xml     1.0M Locations.si.xml
     932K Locations.az.xml         972K Locations.fi.xml   948K Locations.mg.xml      916K Locations.sk.xml
     948K Locations.be.xml         976K Locations.fr.xml   968K Locations.mk.xml      976K Locations.sl.xml
    1000K Locations.be@latin.xml   920K Locations.ga.xml   1.1M Locations.ml.xml      964K Locations.sq.xml
    1004K Locations.bg.xml         976K Locations.gl.xml   932K Locations.mn.xml      972K Locations.sr.xml
     984K Locations.bn.xml         1.1M Locations.gu.xml   1.1M Locations.mr.xml      968K Locations.sr@latin.xml
     1.1M Locations.bn_IN.xml      976K Locations.he.xml   916K Locations.ms.xml      972K Locations.sv.xml
     936K Locations.bs.xml         1.1M Locations.hi.xml   972K Locations.nb.xml      1.1M Locations.ta.xml
     972K Locations.ca.xml         948K Locations.hr.xml   996K Locations.ne.xml      1.1M Locations.te.xml
     976K Locations.cs.xml         976K Locations.hu.xml   972K Locations.nl.xml      996K Locations.th.xml
     948K Locations.cy.xml         948K Locations.id.xml   948K Locations.nn.xml      972K Locations.tr.xml
     972K Locations.da.xml         972K Locations.it.xml   916K Locations.oc.xml      976K Locations.uk.xml
     972K Locations.de.xml         996K Locations.ja.xml   1.1M Locations.or.xml      976K Locations.vi.xml
    1012K Locations.dz.xml         956K Locations.ka.xml  1008K Locations.pa.xml      916K Locations.xml
     980K Locations.el.xml         1.1M Locations.kn.xml   972K Locations.pl.xml      960K Locations.zh_CN.xml
     952K Locations.en_CA.xml      976K Locations.ko.xml   976K Locations.pt.xml      976K Locations.zh_HK.xml
     988K Locations.en_GB.xml      936K Locations.ku.xml   988K Locations.pt_BR.xml   976K Locations.zh_TW.xml
     972K Locations.es.xml         956K Locations.ky.xml   972K Locations.ro.xml      4.0K locations.dtd
    

    Let's take a look at a couple of these XML files.

    The start of Locations.en_GB.xml

    <?xml version="1.0" encoding="UTF-8"?>
    <!DOCTYPE gweather SYSTEM "locations.dtd">
    <gweather format="1.0">
    <region>
      <name xml:lang="en_GB">Africa</name>
      <country>
        <name xml:lang="en_GB">Algeria</name>
        <iso-code>DZ</iso-code>
        <fips-code>AG</fips-code>
        <timezones>
          <timezone id="Africa/Algiers"/>
        </timezones>
        <tz-hint>Africa/Algiers</tz-hint>
        <city>
          <name xml:lang="en_GB">Adrar</name>
          <coordinates>27.900000 -0.283333</coordinates>
          <location>
            <name>Adrar</name>
            <code>DAUA</code>
            <coordinates>27.883333 -0.283333</coordinates>
          </location>
        </city>
    

    The start of Locations.fr.xml

    <?xml version="1.0" encoding="UTF-8"?>
    <!DOCTYPE gweather SYSTEM "locations.dtd">
    <gweather format="1.0">
    <region>
      <name xml:lang="fr">Afrique</name>
      <country>
        <name xml:lang="fr">Algérie</name>
        <iso-code>DZ</iso-code>
        <fips-code>AG</fips-code>
        <timezones>
          <timezone id="Africa/Algiers"/>
        </timezones>
        <tz-hint>Africa/Algiers</tz-hint>
        <city>
          <name xml:lang="fr">Adrar</name>
          <coordinates>27.900000 -0.283333</coordinates>
          <location>
            <name>Adrar</name>
            <code>DAUA</code>
            <coordinates>27.883333 -0.283333</coordinates>
          </location>
        </city>
    

    Looks like every file contains the exact same data, just with localised place names.
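
A quick way to check that claim, as a rough sketch rather than anything libgweather ships: blank out every <name> that carries an xml:lang attribute and compare what is left. The two inline documents are trimmed-down versions of the excerpts above, and the helper names skeleton and localised_names are made up for illustration.

```python
import xml.etree.ElementTree as ET

# ElementTree expands the predefined xml: prefix to this namespace URI.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Trimmed-down stand-ins for Locations.en_GB.xml and Locations.fr.xml.
EN_GB = """<gweather format="1.0"><region>
  <name xml:lang="en_GB">Africa</name>
  <country>
    <name xml:lang="en_GB">Algeria</name>
    <iso-code>DZ</iso-code>
    <city>
      <name xml:lang="en_GB">Adrar</name>
      <coordinates>27.900000 -0.283333</coordinates>
      <location><name>Adrar</name><code>DAUA</code></location>
    </city>
  </country>
</region></gweather>"""

FR = """<gweather format="1.0"><region>
  <name xml:lang="fr">Afrique</name>
  <country>
    <name xml:lang="fr">Algérie</name>
    <iso-code>DZ</iso-code>
    <city>
      <name xml:lang="fr">Adrar</name>
      <coordinates>27.900000 -0.283333</coordinates>
      <location><name>Adrar</name><code>DAUA</code></location>
    </city>
  </country>
</region></gweather>"""


def skeleton(xml_text):
    """Flatten the document in order, blanking every localised <name>."""
    rows = []
    for elem in ET.fromstring(xml_text).iter():
        if elem.tag == "name" and XML_LANG in elem.attrib:
            rows.append((elem.tag, (), "<localised>"))
        else:
            attrs = tuple(sorted(elem.attrib.items()))
            rows.append((elem.tag, attrs, (elem.text or "").strip()))
    return rows


def localised_names(xml_text):
    """Return only the text that actually differs between languages."""
    root = ET.fromstring(xml_text)
    return [e.text for e in root.iter("name") if XML_LANG in e.attrib]


print(skeleton(EN_GB) == skeleton(FR))  # prints True
print(localised_names(FR))
```

Run against the real files (read from disk instead of the inline strings), a True here would mean everything except the translated names is duplicated in each file.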



  • XML there is a WTF of course, but... what kind of distro installs all languages at once?

    [vi@hedgehog ~]$ ls -lh /usr/share/libgweather
    razem 928K
    -rw-r--r-- 1 root root  968 2008-10-24 05:57 locations.dtd
    -rw-r--r-- 1 root root 919K 2008-10-24 05:57 Locations.xml



  • So I guess breaking out their localizable and unlocalizable data into separate units and merging them together at runtime never came up? Although in my experience not a lot of developers even consider localization until their product's almost ready to ship. Not a very efficient use of drive space, but I guess as long as you never ever have to edit what's in those files it's not a big deal. Heck, I think there are even tools that will localize XML content for you. Then again, maybe their "tool" was just a Perl script with key/value pairs of strings to translate. Having all your localized files in one huge directory would DEFINITELY be annoying, though...
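
For what the "separate units, merged at runtime" idea could look like, here is a minimal sketch. Everything in it is hypothetical: the BASE table, the TRANSLATIONS catalogue, the id scheme and the localise function are invented for illustration and are not how libgweather stores its data.

```python
# Language-neutral data, stored once. The "name" here is the fallback.
BASE = [
    {"id": "region/africa", "name": "Africa"},
    {"id": "country/DZ", "name": "Algeria", "iso": "DZ", "tz": "Africa/Algiers"},
    {"id": "city/DZ/adrar", "name": "Adrar", "coords": (27.9, -0.283333)},
]

# Per-language catalogues hold only the strings that actually differ.
TRANSLATIONS = {
    "fr": {"region/africa": "Afrique", "country/DZ": "Algérie"},
}


def localise(base, lang):
    """Merge the shared table with one language's names at load time."""
    catalogue = TRANSLATIONS.get(lang, {})
    return [
        {**entry, "name": catalogue.get(entry["id"], entry["name"])}
        for entry in base
    ]


for entry in localise(BASE, "fr"):
    print(entry["id"], "->", entry["name"])
```

With this layout each extra language costs only its list of translated names, and an unknown language simply falls back to the untranslated table.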



  • They can't all be the same, since they have different sizes, but this is what localization has come to: get the XML file with an extension based on an ISO code, no matter whether the contents are identical or not. Not even an attempt to zip the files. Who cares about 80MB these days? Lazy, lazy, lazy.



  • @viraptor said:

    -rw-r--r-- 1 root root 919K 2008-10-24 05:57 Locations.xml
     

    TRWTF: Who installs their software/distro at 6 AM in the morning?



  • @dtech said:

    @viraptor said:

    -rw-r--r-- 1 root root 919K 2008-10-24 05:57 Locations.xml
     

    TRWTF: Who installs their software/distro at 6 AM in the morning?

    TRWTF: the automatic assumption that system time is local time. WTF wants to have daylight saving time problems twice a year? Also, many of us manage servers in many timezones; to be able to correlate log files easily, a common timezone is vital.

    For what it's worth, before I got married, I was frequently still up at 6AM on Saturday.



  • @tgape said:

    @dtech said:

    @viraptor said:

    -rw-r--r-- 1 root root 919K 2008-10-24 05:57 Locations.xml
     

    TRWTF: Who installs their software/distro at 6 AM in the morning?

    For what it's worth, before I got married, I was frequently still up at 6AM on Saturday.

    1. You beat me to it, but yeah... it's 6AM at night, not 6AM in the morning.

    2. It's the package creation time, not local installation time:

    [vi@hedgehog ~]$ sudo apt-get install dc
    ...
    [vi@hedgehog ~]$ ls -la `which dc`
    -rwxr-xr-x 1 root root 32712 2007-12-05 12:33 /usr/bin/dc
    [vi@hedgehog ~]$ LANG=C date
    Mon Apr 20 02:25:42 BST 2009
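
The same effect can be reproduced without a package manager. This is only a sketch of the general mechanism, under the assumption that the installer restores the modification time recorded in the archive the way os.utime does here; the BUILD_TIME value and the unpack_like_an_archive helper are made up for illustration.

```python
import os
import tempfile
from datetime import datetime, timezone

# Pretend this is the timestamp recorded when the package was built.
BUILD_TIME = datetime(2007, 12, 5, 12, 33, tzinfo=timezone.utc).timestamp()


def unpack_like_an_archive(directory):
    """Write a file now, then restore the recorded build-time mtime."""
    path = os.path.join(directory, "dc")
    with open(path, "w") as fh:
        fh.write("pretend binary\n")
    # Set both access and modification time to the archived value.
    os.utime(path, (BUILD_TIME, BUILD_TIME))
    return path


with tempfile.TemporaryDirectory() as tmp:
    installed = unpack_like_an_archive(tmp)
    mtime = os.stat(installed).st_mtime
    shown = datetime.fromtimestamp(mtime, tz=timezone.utc)
    print("written just now, but mtime says", shown.isoformat())
```

So the date in an ls listing says when the file's contents were last stamped, not when they landed on this particular disk.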

     



  • @viraptor said:

    XML there is a WTF of course, but... what kind of distro installs all languages at once?

     

    [vi@hedgehog ~]$ ls -lh /usr/share/libgweather
    razem 928K
    -rw-r--r-- 1 root root 968 2008-10-24 05:57 locations.dtd
    -rw-r--r-- 1 root root 919K 2008-10-24 05:57 Locations.xml

    You seem to use Debian (based on your apt-get example above), yet you only have a single XML file in that directory. I'm confused. What packages have you installed to achieve that feat? The Debian package tracking system lists all of the files for libgweather-common, and my system (unstable, last update yesterday) agrees. I complained about the enormous number of XML files way back, and since then they've gzipped them, reducing the total size to "only" 13MB.
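
On the gzip point: markup this repetitive tends to compress well. A rough sketch with Python's standard gzip module follows. The data is generated by a made-up helper, fake_locations_xml, and is far more repetitive than the real files, so the ratio it prints says nothing about the actual Debian package.

```python
import gzip


def fake_locations_xml(lang, cities=2000):
    """Build a synthetic, highly repetitive stand-in for a Locations file."""
    entry = (
        '<city><name xml:lang="{lang}">Adrar</name>'
        "<coordinates>27.900000 -0.283333</coordinates></city>\n"
    )
    body = entry.format(lang=lang) * cities
    return ("<gweather>\n" + body + "</gweather>\n").encode("utf-8")


raw = fake_locations_xml("fr")
packed = gzip.compress(raw)

print(len(raw), "bytes raw,", len(packed), "bytes gzipped")

# Reading it back is lossless, so a loader could decompress on the fly.
restored = gzip.decompress(packed)
```

Compression only hides the duplication, though. Every language file still carries its own copy of the coordinates and codes.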


  • I'm on Ubuntu 8.04, which also uses apt and is based on Debian, and I have only one file, Locations.xml.

    Quote from viraptor:

    [vi@hedgehog ~]$ sudo apt-get install dc

    (Hoary?) Hedgehog was actually an Ubuntu release.

    edit: i fail at quoting



  • My libgweather-common also includes only one file. This is Ubuntu 8.10.

    "hedgehog" is just the name of my main machine (it's been a tradition since... ~ Slackware 3.5)



  • @tgape said:

    TRWTF: the automatic assumption that system time is local time. WTF wants to have daylight saving time problems twice a year?

    WhoTF uses a distro that doesn't take care of that? Seriously, I can't think of any reason not to keep your time correct.

    @tgape said:

    Also, many of us manage servers in many timezones; to be able to correlate log files easily, a common timezone is vital.

    Yeah, because libgweather ("a weather library for the GNOME desktop") is frequently found on servers. Furthermore, having the incorrect system time is not a very smart way to correlate log files. It's better to just save the timestamps in UTC.

     @tgape said:

    For what it's worth, before I got married, I was frequently still up at 6AM on Saturday.

    Me too. Only usually that's when I've just come back home, not when I'm booting up my computer to install/update some software... (I don't even know if I can type my root pwd correctly then :))
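
On the "save the timestamps in UTC" suggestion, a minimal sketch of what that buys you when correlating logs from machines in different zones. The two fixed offsets, the host labels "web" and "db", and the helper to_utc_stamp are assumptions chosen for illustration; fixed offsets are used instead of a timezone database so the example has no external dependencies.

```python
from datetime import datetime, timedelta, timezone

# Fixed offsets standing in for two servers' local clocks.
BST = timezone(timedelta(hours=1), "BST")
EDT = timezone(timedelta(hours=-4), "EDT")


def to_utc_stamp(local):
    """Convert an aware local time to an ISO 8601 string in UTC."""
    return local.astimezone(timezone.utc).isoformat()


london = datetime(2009, 4, 20, 2, 25, 42, tzinfo=BST)
new_york = datetime(2009, 4, 19, 21, 25, 50, tzinfo=EDT)

# Once both are in UTC, plain string sorting gives chronological order.
events = sorted(
    [("web", to_utc_stamp(new_york)), ("db", to_utc_stamp(london))],
    key=lambda event: event[1],
)

for host, stamp in events:
    print(stamp, host)
```

The local wall-clock readings look a day apart, yet the two events are eight seconds apart, which is obvious only after both are normalised. The system clock itself can stay on local time.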



  • @dtech said:

    Furthermore, having the incorrect system time is not a very smart way to correlate log files. It's better to just save the timestamps in UTC TAI64.

    FTFY

