newstap - retrieve news articles and deliver them as mail messages
newstap
newstap is a tool for retrieving news articles from various sources and delivering them in a variety of ways as mail messages. When run, it iterates through a set of user-configured news sources, retrieves any new articles, and dispatches them through user-configured delivery methods.
Currently, news retrieval is limited to NNTP (per RFC 977) servers. For testing (or silliness), there is also a simulated news retrieval method which actually just runs /usr/games/fortune to generate messages. Other retrieval methods are planned, hopefully to cope with things like secure NNTP or Web-based news sources.
Regardless of retrieval method, newstap follows a model where each news source specifies a retrieval method, a server, and optionally a port. Within a news source, you must specify one or more groups. Clearly, newstap is oriented, start to finish, towards USENET news retrieval. There is a set of options, including the delivery method, which can be specified at any level from globally down to per-group.
All news articles are kept in RFC [2]822 format. This is the standard format for both email and USENET news. By default, newstap adds a few headers, including the standard Received: header, before delivery; this is configurable in the options as well.
At startup, newstap loads its configuration from a newstaprc file. This file is typically located in your home directory at ~/.newstaprc or at ~/.newstap/newstaprc. It is a structured file which completely configures newstap. With this version, you must have this file; there is no other way to tell newstap what to do.
Since newstap also needs to remember things in between invocations, such as the last message you retrieved in each group, it also requires a state file. This file is, by default, in ~/.newstap.state; it may also be in ~/.newstap/state, or in a custom location specified in your newstaprc file. This file will be created if it does not exist, and will be overwritten each time newstap is run. Theoretically, it is plain human-editable ASCII (and actually uses a variant of newstaprc's format), but you should not edit it. Its format may change unexpectedly, and I'm not going to document it anywhere but in the source code.
A newstaprc file is a plain text configuration file. If you want a quick start, skip down to the EXAMPLES section, copy and paste the examples, and edit them to suit. Use this section as a reference.
Parsing is line-oriented, so don't go splitting statements over multiple lines or putting multiple statements on a line. That won't work. All keywords are case sensitive in this version.
Comments may occur anywhere, and are denoted by the usual shell-style `#' character. The line is truncated at the first occurrence of this character. Blank lines are allowed. Leading and trailing space is removed from each line before parsing. Whitespace is whitespace; feel free to separate words by tabs or whatever. Finally, note that words are delimited ONLY by whitespace; so, unlike some other formats, `foo{' is one token and not two. Make sure you put whitespace where I put it in my descriptions.
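The line-level rules above can be sketched in a few lines of Python. This is only an illustration of the described behaviour, not newstap's own parser; the function name tokenize_line is made up for the example.

```python
def tokenize_line(raw_line):
    """Split one newstaprc-style line into whitespace-delimited tokens.

    Hypothetical helper that mirrors the rules described above.
    """
    # Truncate the line at the first '#': the rest is a comment.
    text = raw_line.split("#", 1)[0]
    # str.split() with no arguments drops leading and trailing whitespace
    # and treats any run of spaces or tabs as a single separator.
    return text.split()


if __name__ == "__main__":
    print(tokenize_line("  group\tcomp.lang.lisp   # a comment"))
    print(tokenize_line("nntp news.example.com{"))
```

In the second call the missing space before the brace yields the single token `news.example.com{`, which is exactly the pitfall the text warns about.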
newstaprc allows certain types of blocks to occur: these are denoted by a statement ending with a `{', which begins the block, and the special statement `}' (alone on a line), which ends the current block.
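As a rough sketch of how such blocks nest, the following Python reads lines into a tree using a stack. The data layout (token lists for plain statements, a (tokens, children) tuple for a block) and the name parse_blocks are assumptions made for this example; newstap's internal representation may differ.

```python
def parse_blocks(lines):
    """Parse newstaprc-style lines into a nested structure.

    A plain statement becomes a list of tokens. A statement ending in '{'
    becomes a (tokens, children) tuple. Hypothetical helper.
    """
    root = []
    stack = [root]  # stack[-1] is the block currently being filled
    for number, raw in enumerate(lines, start=1):
        tokens = raw.split("#", 1)[0].split()
        if not tokens:
            continue  # blank line or comment only
        if tokens == ["}"]:
            if len(stack) == 1:
                raise ValueError(f"line {number}: '}}' without an open block")
            stack.pop()
        elif tokens[-1] == "{":
            children = []
            stack[-1].append((tokens[:-1], children))
            stack.append(children)
        else:
            stack[-1].append(tokens)
    if len(stack) != 1:
        raise ValueError("end of input inside an unclosed block")
    return root


if __name__ == "__main__":
    config = parse_blocks([
        "nntp news.example.com {",
        "    group comp.lang.lisp",
        "}",
    ])
    print(config)
```

A stack keeps the sketch short and makes an unbalanced `}' or a missing one easy to report.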
As in the shell, newstaprc files allow you to insert the values of environment variables or your home directory using one of the following forms:
Text      Gets Replaced With
----      ------------------
${name}   The value of the environment variable `name'.
$name     The value of the environment variable `name'.
~         Your home directory.
The ${...} form is less ambiguous than the $... form. ~ first looks for a ${HOME} value; if that is not defined, it reads your passwd file entry. These may all occur anywhere within any line in the file, and are interpreted when the file is read.
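The substitution rules can be illustrated with a short Python sketch. It is not newstap's code: the name expand_line is invented, variable names are assumed to consist of letters, digits and underscores, and an unset variable is assumed to expand to nothing. The pwd module used for the passwd lookup is Unix-only.

```python
import os
import pwd
import re

# One pattern for all three forms, so each piece of text is replaced once.
_TOKEN = re.compile(r"\$\{(\w+)\}|\$(\w+)|~")


def home_directory(env):
    """Use HOME if it is defined, otherwise the user's passwd entry."""
    if "HOME" in env:
        return env["HOME"]
    return pwd.getpwuid(os.getuid()).pw_dir


def expand_line(line, env=None):
    """Expand ${name}, $name and ~ in one line (hypothetical helper)."""
    env = os.environ if env is None else env

    def substitute(match):
        if match.group(0) == "~":
            return home_directory(env)
        name = match.group(1) or match.group(2)
        # Assumption: an unset variable expands to the empty string.
        return env.get(name, "")

    return _TOKEN.sub(substitute, line)


if __name__ == "__main__":
    env = {"HOME": "/home/trickey", "USER": "trickey"}
    print(expand_line("delivery mbox ~/Mail/${USER}news", env))
    print(expand_line("delivery mbox ~/Mail/$USERnews", env))
```

The second call shows why the braced form is less ambiguous: without braces the sketch looks up a variable called USERnews rather than USER.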
These statements don't directly affect any newstap settings; they are useful when constructing and testing RC files.
The following statements set options which can occur anywhere in the file. Per-group options override per-server options; per-server options override global options; global options override defaults. You get the picture.
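The override order can be pictured as a chain of lookups, most specific level first. The Python below is a sketch under that reading; effective_options is a made-up name, and the default value shown is a placeholder rather than newstap's real default.

```python
from collections import ChainMap


def effective_options(defaults, global_opts, server_opts, group_opts):
    """Combine option levels so that the most specific setting wins.

    ChainMap searches its mappings from left to right, so the per-group
    options are listed first and the built-in defaults last.
    """
    return ChainMap(group_opts, server_opts, global_opts, defaults)


if __name__ == "__main__":
    defaults = {"delivery": "(placeholder default)"}  # not newstap's real default
    global_opts = {"delivery": "message | procmail"}
    server_opts = {}
    group_opts = {"delivery": "mbox ~/Mail/lisp"}

    # The group sets its own delivery, so that value is used.
    print(effective_options(defaults, global_opts, server_opts, group_opts)["delivery"])
    # A group with no setting of its own falls back to the global one.
    print(effective_options(defaults, global_opts, server_opts, {})["delivery"])
```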
args, if present, will be scanned for % characters and formatting will be done; see FORMATTING STRINGS, below.
Note that this is an approximate truncation value. Typically, the body text will actually be rounded up to the next line.
value will be scanned for % characters and formatting will be done; see FORMATTING STRINGS, below.
value will be scanned for % characters and formatting will be done; see FORMATTING STRINGS, below.
Note that statements within a group override those for the enclosing server, which override those outside of any server blocks. This gives you finer-grained control over which news sources may consume your bandwidth.
These options can only appear at the global level. They make no sense within any blocks.
These statements can only appear immediately within a server block.
newstap supports a small set of message delivery methods. It is extensible at the source code level to support new methods; however, it is more flexible to use a well-known method such as standard `mbox' format files and use other tools for ultimate delivery.
Current delivery methods include:
If name is a regular file, then by default newstap will attempt to lock it by creating a link named ``name.lock''. If it cannot obtain this lock because some other process holds it, it will wait until it can. This locking behavior can be disabled by prefixing name with an asterisk character:
delivery mbox * mbox-file-name
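For readers unfamiliar with lock files of this kind, here is a minimal Python sketch of the general technique. It assumes the link is a hard link, which the text above does not actually say, and the function names, polling interval and attempts limit are all invented for the example; newstap's real behaviour may differ.

```python
import os
import tempfile
import time


def acquire_lock(name, poll_seconds=1.0, attempts=None):
    """Create name + '.lock' as a hard link, waiting while it already exists.

    Hypothetical helper. With attempts=None it waits indefinitely.
    """
    lock_path = name + ".lock"
    directory = os.path.dirname(os.path.abspath(name))
    # A private temporary file in the same directory serves as the link source.
    fd, temp_path = tempfile.mkstemp(prefix=".lock-", dir=directory)
    os.close(fd)
    tried = 0
    try:
        while True:
            try:
                # Creating a link fails if lock_path already exists,
                # so only one process can succeed at a time.
                os.link(temp_path, lock_path)
                return lock_path
            except FileExistsError:
                tried += 1
                if attempts is not None and tried >= attempts:
                    raise TimeoutError(f"could not lock {name}")
                time.sleep(poll_seconds)
    finally:
        # The lock file keeps existing after its temporary name is removed.
        os.unlink(temp_path)


def release_lock(lock_path):
    """Remove the lock file so that other processes can proceed."""
    os.unlink(lock_path)
```

Link creation is used here because it either succeeds or fails as a single step, so two processes cannot both believe they hold the lock.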
With | specified, name is interpreted as a regular shell command, and there's no reason it can't contain further pipes and redirects:
delivery mbox | filter_one | filter_two 2>/dev/null >> output
Also note that the delivery statement applies formatting to its arguments, so you can easily do things like:
delivery mbox ~/Mail/mbox-%s-%g
SMTP is a network protocol whereby newstap will connect to a server and deliver messages. If you specify host, followed by a bang, before the address (no space may occur within host or between it and the bang), that host will be the one that newstap connects to. Otherwise, if your target address contains a hostname (i.e. user@host), that will be the host that newstap connects to. If no host is specified in either way, newstap will by default connect to localhost.
address may be either a ``bare address'', such as trickey or trickey@foo.bar, or an RFC 822-style full name plus address, in which a name such as Aaron Trickey is followed by the address in angle brackets.
Examples:
# Deliver to my local account (most likely usage)
delivery smtp trickey
# Deliver to a specific account, with a prettier To: line
delivery smtp mail.mydomain.dom!Typical User <tuser@mydomain.com>
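The host selection rules described above can be sketched as follows. This is an illustration only: split_delivery_target is a hypothetical name, and the exact parsing newstap performs is not documented here.

```python
from email.utils import parseaddr


def split_delivery_target(target, default_host="localhost"):
    """Return (host_to_connect_to, address) for an smtp delivery target.

    Hypothetical helper that follows the three rules in the text.
    """
    head, bang, rest = target.partition("!")
    # Rule 1: an explicit "host!" prefix, with no whitespace inside host.
    if bang and head and not any(ch.isspace() for ch in head):
        return head, rest
    # Rule 2: otherwise use the host part of the address, if it has one.
    _, bare_address = parseaddr(target)
    _, at_sign, domain = bare_address.rpartition("@")
    if at_sign and domain:
        return domain, target
    # Rule 3: fall back to the local machine.
    return default_host, target


if __name__ == "__main__":
    for target in (
        "trickey",
        "trickey@foo.bar",
        "mail.mydomain.dom!Typical User <tuser@mydomain.com>",
    ):
        print(split_delivery_target(target))
```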
In certain places, newstap lets you specify a so-called `formatted string'. This is a piece of text which can contain special `formatting codes' that get replaced with different values.
Code   Replaced With
----   -------------
%%     %
%d     The current time and date, in a standard format
%g     The current group name
%h     The local hostname
%p     The port of the current server
%s     The current server name
%u     The user name under which newstap is running
%v     The name and version of the program (e.g. newstap 0.9.2)
So, for example, you might have
add_header X-From-Newsgroup news://%s:%p/%g
which might result in something like
X-From-Newsgroup: news://news.foo.bar:119/alt.os.linux
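A formatted string of this kind can be expanded with a single regular-expression pass, as in the Python sketch below. The function name expand_format is invented, leaving unknown codes untouched is an assumption, and %p is taken from the add_header example above, where it evidently stands for the server port.

```python
import re


def expand_format(template, values):
    """Replace %-codes in template using the values mapping.

    Hypothetical helper: values maps a code letter (without the '%')
    to its replacement.
    """
    def substitute(match):
        code = match.group(1)
        if code == "%":
            return "%"  # '%%' is a literal percent sign
        # Assumption: a code with no known value is left as written.
        return str(values.get(code, match.group(0)))

    return re.sub(r"%(.)", substitute, template)


if __name__ == "__main__":
    values = {"s": "news.foo.bar", "p": 119, "g": "alt.os.linux"}
    print("X-From-Newsgroup: " + expand_format("news://%s:%p/%g", values))
```

With the values shown, the sketch reproduces the header line from the example above.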
newstap will return a nonzero (error) status code if it couldn't load or parse its configuration file or if it encountered any other errors. Otherwise it will return zero.
The following .newstaprc file demonstrates a few of the software's features:
# Filter all my news messages via procmail, just like my email
delivery message | procmail
nntp news.freshmeat.net {
    group fm.announce
}
# Note: news.example.com doesn't really exist....
nntp news.example.com {
    auth My-User-Name My-Password
    # This server keeps articles forever; when I add a new group,
    # just catch up the 100 most recent ones:
    initially newest 100 articles
    group comp.lang.lisp
    group comp.std.c++
}
Lacking support for nontrivial NNTP authentication or secure NNTP transport. Completely single-threaded. No way to configure it except via the config file syntax.
Similar in spirit, and a source of inspiration: fetchmail(1)
How I process messages I retrieve: procmail(1)
How I read those messages: mutt(1)
RFC 977 - Network News Transfer Protocol
RFC 822 - Standard for the Format of ARPA Internet Text Messages
Aaron Trickey <amtrickey@users.sourceforge.net>
When initially looking for a quick way to grab news articles into procmail(1), I came across a small Perl program called `fetchnews' that did the job. Well, I started cleaning it up, fixing some bugs, and adding some features, but got carried away and decided to rewrite it, as much for fun as for functionality. Hence newstap. Fetchnews is available at <http://files.moo.ca/~laotzu/fetchnews.html>, and was written by Matthieu Fenniak.