Problems with EPEL and Fedora mirroring: Many Root Cause Analysis

There was a problem with EPEL and Fedora mirrors for the last 24 hours where people getting updates would get various errors like:

Updateinfo file is not valid XML:

The problem was caused by a bug in the compose, which wrote the updateinfo file out as SQLite rather than as XML. This was fixed within a couple of hours on the Fedora side, but it has taken a lot longer to fix further downstream.

  • Some of the Fedora mirror containers were not updating correctly. We use a Docker container on each proxy to keep the data fresh. About 4 (?) of the 14 proxies said they were updating but do not seem to have done so. These servers were our main IPv6 servers, so people getting updates from them were more affected than other users.
  • Some mirrors only update 1 or 2 times a day (or even slower). This means that your favourite mirror may keep the data for 12 to 48 hours. 
  • Some client plugins like to peg to the quickest mirror to try and keep downloads fast. While we may tell you that there are 20 mirrors up to date, the plugin will use the one it got stuff fastest from in the past. This means you can end up going to a 'broken' mirror for a lot longer.
  • Some yum/dnf systems seem to have other options set to keep the bad xml file until it 'ages' out. This means that while an updated xml is there, some systems are still complaining because their box already has it.
The fixes on the Fedora side are to put in better tests to try and see that this does not happen again. The client side fixes are currently to do either one of the following:

  • yum clean all
  • yum clean metadata
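
One kind of test that could catch this class of problem is a check that a repodata file which is supposed to be XML does not start with the SQLite header. The following is only a minimal sketch, not the test Fedora actually added; the function name and the set of compression formats handled are assumptions made for illustration.

```python
import bz2
import gzip
import lzma

# Every SQLite 3 database file starts with this 16-byte header.
SQLITE_MAGIC = b"SQLite format 3\x00"

# Magic bytes for compression formats commonly seen on repodata files
# (assumption: the file is gzip, bzip2, xz, or uncompressed).
_DECOMPRESSORS = [
    (b"\x1f\x8b", gzip.decompress),
    (b"BZh", bz2.decompress),
    (b"\xfd7zXZ\x00", lzma.decompress),
]


def sniff_payload(raw: bytes) -> str:
    """Return 'xml', 'sqlite', or 'unknown' for a possibly compressed blob."""
    for magic, decompress in _DECOMPRESSORS:
        if raw.startswith(magic):
            raw = decompress(raw)
            break
    if raw.startswith(SQLITE_MAGIC):
        return "sqlite"
    # Skip an optional UTF-8 byte-order mark and leading whitespace.
    if raw.lstrip(b"\xef\xbb\xbf \t\r\n").startswith(b"<"):
        return "xml"
    return "unknown"
```

A compose check could read the updateinfo file's bytes, call `sniff_payload`, and refuse to publish anything that does not come back as "xml".
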
Thank you all for your patience on this problem.


Call for Papers: Flock to Fedora 2017

In summer, an old engineer's fancy turns to writing paper proposals, for it is time for people to submit papers to https://flocktofedora.org/. This year, Flock is being held in Cape Cod, Massachusetts from August 29 to September 1. Flock is also focusing on being a 'get-er-done' conference, where workshops that get many people working on a software problem together will be the focus. So do you have something you have wanted to get done in Fedora that needs a bunch of people from around the US and Europe to focus on it? Put together a short proposal and submit it to https://register.flocktofedora.org/  [Oh and make sure that the people who you need to work with know about it.. and agree that they want to do it also. Surprise is the opposite of consensus.]

The CFP ends on July 15th 2017. Good luck. I am putting in a proposal for a fast moving EPEL workshop. For a more complete post on FLOCK talk/workshop requirements please see http://blog.linuxgrrl.com/2017/06/08/propose-a-talk-for-flock/


The steam roller of life

Some days it really feels like you are the last man standing as the zombie horde rolls in, and sometimes it feels like people just seem to scream stop at every little thing. However, a lot of times it just looks like this to everyone else:

The security guard is doing his job and is the hero of his own story (in fact he has an extra on the DVD about his family). He is trying to get the 'villains' to stop. Austin Powers is the hero in his story because he is just trying to get to the other side of the room to stop Doctor Evil. The vast gulf between the two is just how far apart they are and how little danger there really is. It is also a story about how avoidable the inevitable crunch at the end is.

  1. The guard could have stood to the left or right and let the steamroller go by. [The guard could have also shot Austin or something else.]
  2. Austin could have 'swerved to the left or right' just a little and missed the guard. [Or he could have gotten out and gotten there faster.]
OK so you are thinking "Yes Captain Obvious that is exactly the humour being shown here.. thank you for breaking it down for us..." The point I am looking at is how often this mirrors our online community problems. Someone is trying to accomplish something, and someone for whatever reason yells stop. (Or someone is meant to keep something stable, and someone is ramming through a new paradigm.) Those of us in the moment get caught up in all the energy, and we forget that all most people on the outside see is how avoidable the whole confrontation was.
Sometimes we feel that it is better to get run over by the steamroller than take a step left or right. Sometimes we feel that putting the pedal to the metal on the steamroller is going to make this so much faster, and we can't move it to the right or left for a small change. 


Canaries in a coal mine (apropos nothing)

[This post is brought to you by Matthew Inman. Reading http://theoatmeal.com/comics/believe made me realize I don't listen enough, and Veritasium's https://www.youtube.com/watch?v=UBVV8pch1dM made me realize why thinking is hard. I am writing this to remind myself when I forget and jump on some phrase.]

Various generations ago, part of my family were coal miners, and some of their lore was still passed down many, many years later. One piece of that lore was about the proverbial canary. A lot of people like to think that they are being a canary when they bring up a problem that they believe will cause great harm.. singing louder because they have run out of air.

That isn't what a canary does. The birds in the mines go silent when the air runs out. They may have died or are on the verge of being dead. They got quieter and quieter and what the miners listened for was the lack of noise from birds versus more noise. Of course it is very very hard to hear the birds in the first place in a mine because they aren't quiet places. There is hammering, and shoveling and footsteps echoing down long tubes.. so you might think.. bring more birds.. that just added more distractions and miners would get into fights because the damn birds never shut up. So the birds were few and far between and people would have to check up on the birds every now and then to see if they were still kicking. Safer mines would have some old fellow stay near the bird and if it died/passed out they would begin ringing a bell which could be heard down the hole.

So if analogies were 1:1, the time to worry is not when people are complaining a lot on a mailing list about some change. In fact if everyone complains, then you could interpret that you have too many birds and not enough miners so go ahead. The time to worry would be when things have changed but no one complains. Then you probably really need to look at getting out of the mine (or most likely you will find it is too late).

However analogies are rarely 1:1 or even 1:20. People are not birds, and you should pay attention to when changes cause a lot of consternation. Listen to why the change is causing problems or pain. Take some time to process it, and see what can be done to either alter the change or find a way for the person who is in pain to get out of pain.


Moving EPEL-4 and EPEL-5 to archives

Today we say goodbye to the last parts of EPEL-5 (and also EPEL-4). The top level files in /pub/epel/4 and /pub/epel/5 were moved to /pub/archive/epel so that people who are still needing packages can get them from the archives. People using yum should not see any change in updates because mirrormanager had the changes to point to archives a couple of days previously.
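
For scripts that hard-code the old locations, the move described above amounts to a path rewrite. The sketch below is one hedged way to express it; the function name is made up for illustration, and it touches only the path, since which host a given script uses is not something this post specifies.

```python
import re

# /pub/epel/4 and /pub/epel/5 moved under /pub/archive/epel.
# Other releases (for example /pub/epel/7) are deliberately left alone.
_OLD_EPEL = re.compile(r"/pub/epel/(4|5)(?=/|\s|$)")


def to_archive_path(text: str) -> str:
    """Rewrite EPEL-4/EPEL-5 paths in a URL or script line to the archive location."""
    return _OLD_EPEL.sub(r"/pub/archive/epel/\1", text)
```

Running it twice is harmless, because a path that already sits under /pub/archive/epel no longer matches the pattern.
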

For any kickstarts or scripts that used the main download servers all that needs to be done is change:




and you can have your kickstart scripts grab the epel rpm from


Thanks again to everyone who has helped with EPEL-5 over the years. It was a good crazy ride.


EPEL-5 article appearing on FedoraMagazine.org

So I thought I was not writing anything more about the EOL of EPEL-5, but I got asked by several people why no one had written anything about it 😐. The ability of my posts to reach the world was much smaller than I realized. In order to rectify that a bit, here is another article on the EOL of EPEL-5 this time at Fedora Magazine.


IMPORTANT REMINDER: EL 5 is EOL on March 31, 2017

This is probably my final reminder on this before April 3rd 2017. As listed at https://access.redhat.com/support/policy/updates/errata and https://en.wikipedia.org/wiki/Red_Hat_Enterprise_Linux#Product_life_cycle, Red Hat Enterprise Linux 5 will be exiting "Production Phase 3", and CentOS will be archiving off old EL-5 releases.

At that point, all remaining EPEL-5 packages will be archived to /pub/archive/epel/5 for systems to get data from. No new updates or packages will be done after that.


Trying to get an idea about what packages are used


One of the questions I get asked a lot is "You provide various statistics for Fedora, can you show which packages are installed the most?"

To head off a lot of future requests, the answer is no, no I can't. We do not have any sort of popcorn database which shows what packages are popular. When a user asks the OS to install a package, there is no "Hey I am asking for Bob if I can install libfoobar" that gets sent to the Fedora servers. What yum, dnf, PackageKit, or Salt does instead is request the repo data, look to see if there is a way to figure out what is wanted, and then ask for any packages that it needs to get.

It is from this data that I can sort of glean some idea of the most installed packages.. but I feel it is way past "Lies", "Damn Lies", and "Statistics" into regions like "Political Promises" or "Half Life 3 confirmed". Looking over an entire month of requests, sorting the data, and ranking the requests, I find that a bunch of packages show up a lot while others fall off in a long tail. What makes this data dirty is that if 200 people ask for wordpress, 150 for mediawiki and 90 for nagios.. I will see the various PHP trunk packages that all three want with a higher number than any of them. I can't simply tell if the person wanted that PHP package by itself or wanted wordpress. [I could possibly try and work out a transaction of requested packages and figure out what nodes and leaves there might be.. but I found that the tools don't always request everything they want from download.fedoraproject.org, because they possibly already 'know' where something is.]
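
To make the "sort and rank" step concrete, here is a minimal sketch of how requested rpm file names could be reduced to a package-plus-major-version key and counted. It is not the tooling actually used for the lists in this post; the function names and the sample file names are hypothetical, and it assumes well-formed `name-version-release.arch.rpm` names.

```python
from collections import Counter


def package_key(rpm_filename: str) -> str:
    """Reduce 'name-version-release.arch.rpm' to 'name-majorversion'."""
    stem = rpm_filename.rsplit("/", 1)[-1]       # drop any directory part
    if stem.endswith(".rpm"):
        stem = stem[: -len(".rpm")]
    stem = stem.rsplit(".", 1)[0]                # drop the architecture
    name, version, _release = stem.rsplit("-", 2)
    return f"{name}-{version.split('.')[0]}"


def rank_requests(paths, top=20):
    """Count requests per package key and return the most common ones."""
    return Counter(package_key(p) for p in paths).most_common(top)


# Hypothetical sample: two applications that both pull in the same PHP package.
sample = [
    "wordpress-4.7.2-1.el7.noarch.rpm",
    "php-example-1.0-1.el7.noarch.rpm",
    "mediawiki-1.23.0-1.el7.noarch.rpm",
    "php-example-1.0-1.el7.noarch.rpm",
]
print(rank_requests(sample))
```

In the sample the shared PHP package outranks both applications even though nobody asked for it directly, which is the dirtiness described above.
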

In any case, here are the most requested packages to the download website for January.


EPEL-7

  1. epel-release-7-9
  2. python2-pip-8
  3. python2-boto-2
  4. openvpn-2
  5. php-tcpdf-6
  6. php-tcpdf-dejavu-sans-fonts-6
  7. pdc-updater-0
  8. duplicity-0
  9. nagios-plugins-2 *lots of plugins show up here*
  10. ansible-2
  11. libopendkim-2
  12. opendkim-2
  13. cowsay-3
  14. python2-wikitcms-2
  15. pkcs11-helper-1
  16. fedmsg-0
  17. htop-2
  18. munin *lots of munin packages here*
  19. awscli-1
  20. hdf5-1


EPEL-6

  1. nagios-plugins-2 *lots of other nagios removed*
  2. libmcrypt-2
  3. nodejs-0 *lots of other nodejs removed*
  4. python2-boto-2
  5. GeoIP-1 *other GeoIP removed*
  6. geoipupdate-2
  7. nrpe-2
  8. libnet-1
  9. denyhosts-2
  10. eventlog-0
  11. syslog-ng-3
  12. epel-release-6-8
  13. php-pear-Auth-SASL-1
  14. php-pear-Net-SMTP-1
  15. php-pear-Net-Socket-1
  16. perl-Net-IDN-Encode-2
  17. perl-Net-Whois-Raw-2
  18. perl-Regexp-IPv6-0
  19. pwhois-2
  20. v8
EPEL-6 is our most popular distribution. Over the month of January, requests came in at a ratio of about 12 EPEL-6 : 7 EPEL-7 : 1.5 Fedora 25 : 1 EPEL-5.
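
Turning that ratio into percentages is plain arithmetic; the small sketch below (the function name is mine) does it for the four releases mentioned. The shares are only relative to these four, not to all traffic.

```python
def shares(ratio: dict) -> dict:
    """Convert relative request ratios into percentages of their total."""
    total = sum(ratio.values())
    return {name: round(100 * part / total, 1) for name, part in ratio.items()}


january = {"EPEL-6": 12, "EPEL-7": 7, "Fedora 25": 1.5, "EPEL-5": 1}
print(shares(january))
```

That works out to roughly 56% EPEL-6, 33% EPEL-7, 7% Fedora 25 and 5% EPEL-5.
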


EPEL-5

  1. R-core-3 *lots of other R packages removed*
  2. globus-gssapi-gsi-devel-12 *lots of other globus removed*
  3. nordugrid-arc-5
  4. xrootd-client-libs-4 *lots of other xrootd removed*
  5. pcp-libs-devel-3
  6. nordugrid-arc-devel-5
  7. libopendkim-2
  8. libopendmarc-1
  9. pcp-libs-3
  10. nordugrid-arc-plugins-globus-5
  11. libopendkim-devel-2
  12. libopendmarc-1
  13. ebtree-6
  14. myproxy-libs-6
  15. mosh-1
  16. lua-cyrussasl-1
  17. drupal7
  18. rear-2
  19. clustershell-1
  20. rsnapshot-1
I found it interesting that R was getting pulled in by a lot of computers on EPEL-5. This OS is almost end of lifed, but it looks like systems are still getting provisioned with it.

Fedora 25

  1. java-1
  2. vim-minimal-8
  3. kernel-core-4
  4. libX11-1
  5. perl-libs-5
  6. perl-5
  7. perl-IO-1
  8. perl-macros-5
  9. perl-Errno-1
  10. nss-3
  11. gdk-pixbuf2-2
  12. gtk3-3
  13. audit-libs-2
  14. nss-softokn-freebl-3
  15. libX11-common-1
  16. gdk-pixbuf2-modules-2
  17. libnl3-3
  18. gnutls-3
  19. pcre-8
  20. gtk-update-icon-cache-3
As can be seen from the Fedora 25 list, there is another problem with my trying to get an idea of packages.. an update to a package that is installed on a lot of boxes will also show up.


I really don't think any 'real' conclusions can come out of this other than people really want vim on their Fedora 25 desktops (emacs was way down the list). 😑 I also want to say that we should get an opt-in popcorn for Fedora :).

[Edited: I forgot this part]

This list of agents which get used to pull down packages for EPEL and Fedora was rather interesting. I combined all the yum together as the many different versions kind of polluted the numbers but here are the top agents:

  1. yum
  2. Salt
  3. dnf
  4. Artifactory
  5. python-requests
  6. Debian Apt-Cacher-NG
  7. PackageKit-hawkey
  8. Axel 2.4 (Linux)
  9. Wget
  10. libdnf
  11. curl
  12. urlgrabber
The Salt requests seem to come from a large number of Amazon systems which are installing either epel-release-6 (80% of the time) or epel-release-7 (20% of the time). Nothing else seemed to be 'pulled' from download.fedoraproject.org by them, so it is probably just a config artifact on bootup.
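
Combining the many versions of one client into a single entry can be sketched as below. This is an illustration rather than the script used for the list above: the family names come from that list, but the matching rule and the sample User-Agent strings in the usage note are assumptions.

```python
from collections import Counter

# Order matters: more specific names are checked before more general ones,
# so "libdnf" is not counted as "dnf" and a yum string that also mentions
# urlgrabber is counted as yum.
FAMILIES = ["libdnf", "dnf", "yum", "urlgrabber", "Salt", "PackageKit", "Wget", "curl"]


def agent_family(user_agent: str) -> str:
    """Collapse a versioned User-Agent string into a coarse family name."""
    lowered = user_agent.lower()
    for family in FAMILIES:
        if family.lower() in lowered:
            return family
    # Unknown agents: keep whatever precedes the first "/" as the name.
    return user_agent.split("/", 1)[0].strip() or "unknown"


def count_agents(user_agents):
    """Return (family, count) pairs, most common first."""
    return Counter(agent_family(ua) for ua in user_agents).most_common()
```

For example, `count_agents(["yum/3.4.3", "yum/3.2.29", "dnf/1.1.10"])` groups the two yum versions together. Simple substring matching can misfile an agent whose name happens to contain one of these words, so real log data would need the list checked by hand.
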


Major update to Nagios in Fedora Rawhide and EPEL-7 [moving to 4.2.4]

After a couple months of work, I have put together an updated package for Fedora Rawhide and EPEL-7 today. I expect it will have some 'problems' and so have moved the needed karma to 4 and am looking for people to test and give it negative karma with feedback for items broken.

I will work on getting those done this week so we can try and have working versions of Nagios for Fedora Server 26 and EPEL. Currently I expect it to need changes to the selinux policies for both and may need some additional work there. I am working through the processes for getting those done.

EPEL-6 will need some more work because the rpmbuild is complaining that it can't make /var/run/nagios for some reason.

Creating a new update for  nagios-4.2.4-2.el7 
  Update ID: FEDORA-EPEL-2017-0f3297a19b
    Release: Fedora EPEL 7
     Status: pending
       Type: security
      Karma: 0
    Request: testing
       Bugs: 1288989 - None
           : 1289710 - None
           : 1299166 - None
           : 1322666 - None
           : 1329857 - None
           : 1330627 - None
           : 1341683 - None
           : 1405365 - None
           : 1411399 - None
      Notes: Major Update. Fixes various CVE and other issues.
  Submitter: smooge
  Submitted: 2017-02-07 23:46:16
   Comments: bodhi - 2017-02-07 23:46:17 (karma 0)
             This update has been submitted for testing by smooge.


Major update to Fedora/EPEL moving to nrpe-3.0.1

The version of nrpe in Fedora has been 2.15 for a very long time while the upstream Nagios group moved to a 3.0 series. With some work and a lot of help from my friends, I have put an updated nrpe into EPEL-testing for EPEL-6 and EPEL-7 and in Fedora Rawhide.


I have set the EPEL update karma for this to 4 rather than 3, as I would like more testing done by people before it gets pushed. If it gets a lot of negative karma I will pull it and work with upstream to get a working version into EPEL.


This version of nrpe was the 'fun' one. This is due to the fact that the newer OpenSSL does not allow for introspection of various structures which it used to. Working with Tomas Mraz and Patrick Uiterwijk, I believe I have a semi-working version. [A secondary problem was that I had to pull some sslv2 code out because we do not ship with those libraries anymore. I am hoping upstream will come up with a better fix than my hacksaw method.]

Fedora 25

I have not put in an update to Fedora 25 because it is a major update and was not listed as a change request. I am looking through what needs to be done for this, and when I have gotten any approvals needed will publish it to Fedora 25 testing.


Reminder to self: Using Dennett's principles

These last couple of weeks have been a complete emotional and intellectual mess for me. People are arguing to the right of me, to the left of me and everyone seems to have pulled out their favorite purity test to try and prove if someone is good enough to be in their camp.

In trying to come up with clearer axioms to gauge all the various poop-storms without getting into emotives or purity tests, I ran into this article about Daniel Dennett's tools for trying to be a critical thinker. I will repeat the paraphrase below:

  1. Accept you make mistakes, and then use them to be a better person.
  2. Respect your opponent. 
  3. Beware of "surely" as it is overused as a rhetorical device to avoid critical thinking by assuming something is sure.
  4. Answer rhetorical questions. As with 'surely', using a rhetorical question is a way to avoid thinking about something by being facetious.
  5. Employ Occam's Razor
  6. Employ Sturgeon's Law. 90% of everything is rubbish... [which cuts both ways.] Don't waste your time defending rubbish and don't waste your time attacking it. Work on the 10-20% which isn't.
  7. Avoid deepities... things which are deep and profound but not well defined. [Another way of looking at it is "If it sounds too good to be true, it probably is".] 
Anyway, thanks to Jonathan Corbet of lwn.net for reminding me of the Sturgeon's law Wikipedia article, which led me to that piece.


Mea Culpa: Fedora Elections

As announced here, here, and here, the Fedora Election cycle for the start of the 25 release is done. Congratulations to the winners. Now, if you notice, there were fewer than 250 voters in any of the elections out of several thousand eligible voters.. and I was not one of those who voted.

It is not like the elections were not announced before, at the start, and right before they ended. Yet somehow.. I missed every one of these emails. I caught various emails on NFS changing configurations, proposed changes to Fedora 26 and 27, or various retired packages.. but I completely spaced the elections. I was actually sending an email asking when they would be held when someone congratulated Kevin Fenzi on IRC about winning.

So to the winners of this cycle of elections. Congratulations. To all the people who put in the hard work of running elections (and having run several it is a LOT of hard work), my sincere apologies for somehow missing it.


Fedora/EPEL Mirrormanager problems in Asia Pacific countries.

We have been getting a lot of reports of people unable to get updates for EPEL or Fedora at various times. What people are seeing is that they will do a 'yum update' and it will give a long list of failures and quit. At this moment we seem to have pinpointed that most of the people having this problem are in various Asia Pacific nations (primarily Australia and Japan). The problem for both of these seems to be a lack of cross connects between networks.

In the US, if you are on Comcast in say New Mexico and going to a server on Time Warner in North Carolina, your route is usually pretty direct. You will go from one network to various third party providers who will then send the packets the quickest path to the eventual server. If you use a visual grapher of locations, you even find that the path usually follows a linear path. [You might end up going to say California or Seattle first but that is only when Texas and Colorado cross connects are full.] Similarly in most European countries you also see a similar routing algorithm.

In the various Pacific and Indian Ocean countries, you do not see similar interconnects. You can watch a system in Sydney on one network send packets all the way to San Francisco and back again to a server in Sydney because the two telecoms do not 'talk' with each other. This seems to happen also in Japan for a couple of telecom networks. The result of this is that it is much more expensive to mirror data in those countries than you would think. For users it might be faster to get data from mainland China or the United States than it is to get it from a server only hundreds of miles away.

The problem is that mirrormanager is currently not coded to deal with that. It makes the optimistic assumption that if you are in Adelaide and the nearest server is in Sydney.. you should go to that. The mirror in Sydney, though, may still be catching up because it pulls its data from the mainland US (or, if the mirror admin assumed that an Asia Pacific mirror is the one to go to, it may be pulling data from a server 20 miles away physically but several tens of thousands of miles away by network). The mirrormanager developers are trying to figure out ways to deal with this without making servers and clients send each other network maps with throughput charts to figure things out.. [And no, the fastest mirror yum plugin doesn't fix this for all/most people. It uses a very, very simple 'works for me' test to figure out which mirrors might be a good match at one point in time. You could end up using a mirror that is poor 90% of the time because it looked good the one time the plugin set itself up.. it also just uses the "hey ping is fast" dynamic, which breaks for people on various networks. Improving the fastest mirror plugin would be useful if someone did it.]

So what to do? For EPEL, the current fix is to edit your /etc/yum.repos.d/epel.repo files by adding a '&country=global'

name=Extra Packages for Enterprise Linux 7 - $basearch

This will cause yum to ask for the global versus 'local' and you will get all the mirrors. This usually will give a few servers which are in sync even if they are 'not' local.
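
If you have many machines to change, the edit can be scripted. The helper below is a minimal sketch of appending the parameter to a mirror list URL; the function name is mine, and the example URL (host and query parameters) is a placeholder rather than a line copied from a real epel.repo.

```python
def with_global_country(url: str) -> str:
    """Append country=global to a mirror list URL unless a country is already set."""
    if "country=" in url:
        return url
    separator = "&" if "?" in url else "?"
    return f"{url}{separator}country=global"


# Placeholder URL for illustration only.
example = "https://mirrors.example.org/metalink?repo=epel-7&arch=$basearch"
print(with_global_country(example))
```

Plain string handling is used on purpose so that yum variables such as $basearch are passed through untouched rather than percent-encoded.
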