From newsbeuter to maildir

Here is a small Python script that helps you migrate the RSS items you have kept in newsbeuter to a Maildir-based system. It doesn't claim to be high-quality code: I had less than two hours to learn the sqlite3 and mailbox Python modules and write it. Some elements are hardcoded: you have to state by hand which Maildir directory corresponds to each RSS feed URL stored in newsbeuter.
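To fill in that mapping, it helps to see which feed URLs the cache actually contains. The following is only a minimal sketch, assuming Python 3 and the same rss_item columns that the migration script queries (feedurl, unread, deleted). The function name list_feed_urls is hypothetical and not part of the migration script.

```python
import sqlite3


def list_feed_urls(cache_file):
    """Return (feedurl, item_count) pairs for read, non-deleted items.

    Hypothetical helper: it assumes the rss_item table and the
    feedurl/unread/deleted columns used by the migration script.
    """
    cache = sqlite3.connect(cache_file)
    try:
        rows = cache.execute(
            'select feedurl, count(*) from rss_item'
            ' where unread=0 and deleted=0'
            ' group by feedurl order by feedurl'
        ).fetchall()
    finally:
        # Always release the database handle, even if the query fails
        cache.close()
    return rows
```

Calling list_feed_urls with the path to your cache.db and printing each pair gives you the exact strings to use as keys of the feed2mail dictionary.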

Here is the migration script:

  
  #!/usr/bin/python
  # -*- coding: utf-8 -*-

  # Put your newsbeuter cache messages in Maildir directories

  # This program is free software: you can redistribute it and/or modify
  # it under the terms of the GNU General Public License as published by
  # the Free Software Foundation, either version 3 of the License, or
  # (at your option) any later version.

  # This program is distributed in the hope that it will be useful,
  # but WITHOUT ANY WARRANTY; without even the implied warranty of
  # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
  # GNU General Public License for more details.

  # You should have received a copy of the GNU General Public License
  # along with this program.  If not, see <http://www.gnu.org/licenses/>.

  # Newsbeuter stores its feeds in ~/.newsbeuter/cache.db. This file is a
  # SQLite3 database with 2 tables. The table named rss_item contains all
  # of the RSS/Atom messages. The "body" of the message is stored in HTML
  # (without the headers).

  # This Python script converts the messages of your cache.db file to
  # Maildir repositories. The RSS/Atom messages are put in their
  # respective Maildir...

  # modules import
  from email.mime.text import MIMEText
  import email.utils
  import mailbox
  import sqlite3
  import sys

  # Dictionary associating feed URLs with Maildir directories.
  # You have to change these references manually for your own feeds.
  # Keys are feedurl values from newsbeuter, values are the names of
  # the sub-maildirs.
  feed2mail = { 'http://www.maitre-eolas.fr/feed/atom':'Feeds.Eolas',
                'https://medspx.fr/blog/index.rss20':'Feeds.WhereIsIt',
                'http://planet.debian.net/rss20.xml':'Feeds.planetDebian',
                'http://feeds2.feedburner.com/hackaday/LgoM':'Feeds.HackADay',
                'https://www.adafruit.com/blog/feed/':'Feeds.Adafruit',
                'https://linuxfr.org/news.atom':'Feeds.LinuxFr',
                'http://www.bortzmeyer.org/feed-full.atom':'Feeds.Bortzmeyer',
                'http://www.la-grange.net/feed.atom':'Feeds.LaGrange'}

  # Part 1: Arguments analysis
  # Both the cache file and the maildir path are required
  if len(sys.argv) < 3:
      print 'Usage: newsbeuter2maildir.py cache.db_file maildir'
      sys.exit(1)

  cache_file = sys.argv[1]
  maildir = sys.argv[2]

  print 'extracting from %s to %s ...' % (cache_file, maildir)

  # Part 2: extract messages from cache.db
  cache = sqlite3.connect(cache_file)
  destination = mailbox.Maildir(maildir)

  c = cache.cursor()
  # We only want to extract the items that have been read and which are
  # not deleted
  c.execute('select title, author, feedurl, pubDate, content from'
           + ' rss_item where unread=0 and deleted=0')
  # for every line, we put a message in the right mailbox
  for line in c:
      header = u'<html><head></head><body>'.encode('utf-8')
      footer = u'</body></html>'.encode('utf-8')
      msg = MIMEText(header+line[4].encode('utf-8')+footer, 'html')
      msg['Subject'] = line[0].encode('utf-8')
      msg['From'] = line[1].encode('utf-8')
      msg['Date'] = email.utils.formatdate(float(line[3]))
      if line[2] in feed2mail:
          feed = feed2mail[line[2]]
          # If the maildir doesn't exist, we create it.
          if feed not in destination.list_folders():
              print "maildir", feed, "doesn't exist: creating it"
              maildir_feed = destination.add_folder(feed)
          else:
              maildir_feed = destination.get_folder(feed)
          # Add the message to the maildir and mark it as read ('S' = Seen):
          msg_in_feed = mailbox.MaildirMessage(msg)
          msg_in_feed.set_flags('S')
          msg_key = maildir_feed.add(msg_in_feed)
      else:
          print "Unknown feed:", line[2]

  destination.close()

  sys.exit(0)
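The script above is written for Python 2. For readers on Python 3, the message-building step of the loop could look roughly like the sketch below. It is an adaptation under assumptions, not the code I ran: build_message is a hypothetical name, and its arguments stand for the title, author, pubDate and content columns of one rss_item row.

```python
from email.mime.text import MIMEText
import email.utils
import mailbox


def build_message(title, author, pub_date, content):
    """Wrap one feed item in an HTML mail flagged as read.

    Hypothetical Python 3 counterpart of the loop body: strings are
    already Unicode, so the explicit .encode('utf-8') calls go away and
    the charset is passed to MIMEText instead.
    """
    body = '<html><head></head><body>' + content + '</body></html>'
    msg = MIMEText(body, 'html', 'utf-8')
    msg['Subject'] = title
    msg['From'] = author
    # pub_date is assumed to be a Unix timestamp, as in the original script
    msg['Date'] = email.utils.formatdate(float(pub_date))
    maildir_msg = mailbox.MaildirMessage(msg)
    # 'S' is the Maildir "Seen" flag: the item was already read
    maildir_msg.set_flags('S')
    return maildir_msg
```

The returned object can be passed to the add method of a Maildir folder, exactly as the loop does with msg_in_feed.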