Date: Tue Dec 21 08:11:28 2021 UTC
Message: Added README to install dependencies
Last changed: 2021-12-17 by jrmu (all lines)
================================================================================

Challenge

In this challenge, we will modify our original rssbot.pl to include 8
preloaded RSS feeds. For example, to get the headlines from the IRCNow
Almanack, the user would type: !ircnow

We also need to let the user add and delete RSS feeds.
To add an RSS feed, a user can type !add name URL
To delete an RSS feed, a user can type !delete name

Finally, the old RSS bot displayed every single article in the RSS feed.
Some feeds are very long, with hundreds of articles in them. Let's
update the bot so it only displays 5 items at a time.

================================================================================

Modifying rssbot.pl

We're going to rename RSSBot to NewsBot, so the filename
changes from rssbot.pl to newsbot.pl.

Next, we're going to replace the scalar $url with the hash %feedURLs
so we can download from multiple RSS feeds:

--- /home/perl103/rssbot.pl Tue Aug 31 04:59:42 2021
+++ /home/perl104/newsbot.pl Wed Sep 1 10:38:42 2021
@@ -6,21 +6,66 @@
 use base qw(Bot::BasicBot);
 use XML::RSS::Parser;

-my $url = 'https://wiki.ircnow.org/index.php?n=Site.AllRecentChanges?action=rss';
+my %feedURLs = (
+    "undeadly" => "http://undeadly.org/cgi?action=rss",
+    "eff" => "https://www.eff.org/rss/updates.xml",
+    "hackernews" => "https://news.ycombinator.com/rss",
+    "krebs" => "https://krebsonsecurity.com/feed",
+    "ircnow" => "https://wiki.ircnow.org/index.php?n=Site.AllRecentChanges?action=rss",
+    "schneier" => "https://www.schneier.com/blog/atom.xml",
+    "slashdot" => "http://rss.slashdot.org/Slashdot/slashdotMain",
+    "theregister" => "https://www.theregister.com/headlines.rss",
+);

The keys for %feedURLs are the names of the news sites, and the values
are the URLs of the RSS feeds.
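
To see how the keys and values work together, here is a minimal sketch
that can be run on its own. It uses a two-entry subset of %feedURLs as an
assumption for illustration; it is not part of newsbot.pl.

```perl
use strict;
use warnings;

# Two-entry subset of %feedURLs, for illustration only.
my %feedURLs = (
    "eff"   => "https://www.eff.org/rss/updates.xml",
    "krebs" => "https://krebsonsecurity.com/feed",
);

# Look up one URL by its feed name (the key).
my $url = $feedURLs{"eff"};
print "eff -> $url\n";

# List every feed name in alphabetical order.
my @names = sort keys %feedURLs;
print "Known feeds: @names\n";
```

A hash has no fixed order, which is why the sketch sorts the keys before
printing them.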

Inside the subroutine said, we need to check for two new commands,
!add and !delete, plus the name of an RSS feed (for example, !ircnow).

 sub said {
     my $self = shift;
     my $arguments = shift;
-    if ($arguments->{body} =~ /^!rss/) {
+    if ($arguments->{body} =~ m{^!add\s+(\w+)\s+(https?://[[:print:]]+)$}) {
+        my ($name, $url) = ($1, $2);
+        $feedURLs{$name} = $url;
+        $self->say(
+            channel => $arguments->{channel},
+            body => "$name added.",
+        );
+    }

We first check to see if the user typed !add <name> <url>. Here, we use
Perl regular expressions (regex for short) to see if the user typed in
a valid feed name and URL.

NOTE: It is very important to check that data is valid. If you don't,
it can become a source of security holes which attackers can use to
take control of your program.

Let's take a closer look at the if condition:

+    if ($arguments->{body} =~ m{^!add\s+(\w+)\s+(https?://[[:print:]]+)$}) {

We check if the message $arguments->{body} fits the right format. It must
begin with the string !add, followed by one or more whitespace characters,
then one or more word characters (the feed name), then more whitespace,
then http:// or https://, then one or more printing characters up to the
end of the string. The feed name is captured in $1 and the URL is captured
in $2.
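
The regex can be tried outside the bot. The sketch below wraps it in a
helper called parse_add, which is a hypothetical name used only for this
example; the sample messages are made up as well.

```perl
use strict;
use warnings;

# parse_add is a hypothetical helper, not part of newsbot.pl.
# It returns (name, url) on a match and an empty list otherwise.
sub parse_add {
    my ($body) = @_;
    if ($body =~ m{^!add\s+(\w+)\s+(https?://[[:print:]]+)$}) {
        return ($1, $2);
    }
    return;
}

# A well-formed command: both captures are filled in.
my @ok = parse_add("!add eff https://www.eff.org/rss/updates.xml");
print "name=$ok[0] url=$ok[1]\n";

# Missing http:// or https:// in front of the URL: no match.
my @no_scheme = parse_add("!add eff www.eff.org/rss/updates.xml");

# Missing feed name: no match.
my @no_name = parse_add("!add https://www.eff.org/rss/updates.xml");

print "both malformed commands were rejected\n" if !@no_scheme && !@no_name;
```

Note that [[:print:]] also matches the space character, so this regex
accepts a URL with spaces in it. A stricter bot might use \S+ instead.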

If the IRC message matches our regex, we then store the name and URL as
a key-value pair in our hash %feedURLs, with the name as key and the URL
as value. We then send a message to the channel saying that $name has
been added.

In the next block, we check to see if the user typed
!delete <name>

+    if ($arguments->{body} =~ m{^!delete\s+(\w+)$}) {
+        my $name = $1;
+        delete($feedURLs{$name});
+        $self->say(
+            channel => $arguments->{channel},
+            body => "$name deleted.",
+        );
+    }

If it matches our regular expression, we delete the key-value pair
from %feedURLs and then send a message to the channel.
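
Here is a small sketch of what delete and exists do to a hash. The
one-entry hash and the name nosuchfeed are assumptions made for the
example.

```perl
use strict;
use warnings;

# One-entry hash, for illustration only.
my %feedURLs = ("eff" => "https://www.eff.org/rss/updates.xml");

# delete returns the removed value, or undef if the key was absent.
my $removed = delete($feedURLs{"eff"});
my $missing = delete($feedURLs{"nosuchfeed"});

print "removed: $removed\n";
print "eff still present? ", (exists($feedURLs{"eff"}) ? "yes" : "no"), "\n";
print "deleting an unknown name returned undef\n" if !defined($missing);
```

The block above replies "$name deleted." even when the name was never
added. If that matters, the return value of delete is one way to tell
the two cases apart.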

Now, if a user sends any other command, we check to see if a key-value
pair exists for the feed:

+    if ($arguments->{body} =~ /^!(\w+)$/) {
+        my $name = $1;
+        if (!exists($feedURLs{$name})) {
+            $self->say(
+                channel => $arguments->{channel},
+                body => "Error: $name has not been added",
+            );
+            return;
+        }

If none exists, we send an error message to the channel and return.

If a URL exists for the feed, then we create a new XML::RSS::Parser
object. We're going to replace the old foreach loop because it
printed out every single item in an RSS feed. Some of the new feeds
we add have hundreds of articles; a for loop with a counter lets us
limit the articles to 5 per feed.

         my $p = XML::RSS::Parser->new;
+        my $url = $feedURLs{$name};
         my $feed = $p->parse_uri($url);
-        foreach my $i ( $feed->query('//item') ) {
-            my $title = $i->query('title');
-            my $contributor = $i->query('dc:contributor');
-            my $link = $i->query('link');

In the code below, we first find the feed's title, then loop through
the items in the feed using a for loop. We start with index $i = 0 and
stop when we have printed all items or after we have printed 5, whichever
comes first. Each time through the loop, we increment $i (add one to it).

+        my $qtitle = $feed->query('/channel/title');
+        my $feed_title = $qtitle->text_content;
+        my @qitems = $feed->query('//item');
+        for (my $i = 0; $i < scalar(@qitems) && $i < 5; $i++) {
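
The two-part loop condition can be checked without a real feed. In this
sketch, plain strings stand in for the parsed items; the arrays @shown
and @short are assumptions made for the example.

```perl
use strict;
use warnings;

# Plain strings stand in for parsed feed items (illustration only).
my @qitems = map { "article $_" } 1 .. 8;

# Long feed: the "$i < 5" half of the condition stops the loop.
my @shown;
for (my $i = 0; $i < scalar(@qitems) && $i < 5; $i++) {
    push @shown, $qitems[$i];
}
print scalar(@shown), " shown: @shown\n";

# Short feed: the "$i < scalar(@short)" half stops the loop first.
my @short = ("first", "second");
my $count = 0;
for (my $i = 0; $i < scalar(@short) && $i < 5; $i++) {
    $count++;
}
print "$count shown from the short feed\n";
```

Both halves of the condition are needed: without the first, a short feed
would make the loop read past the end of the array.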

Inside the loop, we store the current item in $qitem. We create
a hash called %item for each item, and we store the feed's title
and the text of each tag inside. If a tag is undefined, we store an
empty string.

+            my $qitem = $qitems[$i];
+            my %item;
+            $item{feed_title} = $feed_title;
+            foreach my $tag (qw(title dc:contributor link comments)) {
+                my $qtag = $qitem->query($tag);
+                if(defined($qtag)) {
+                    $item{$tag} = $qtag->text_content;
+                } else {
+                    $item{$tag} = "";
+                }
+            }
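
The empty-string fallback can be tried with an ordinary hash in place of
the parser. In this sketch, %parsed and its values are made up for the
example; a tag missing from %parsed plays the role of a query that
returned undef.

```perl
use strict;
use warnings;

# Made-up values standing in for one parsed item (illustration only).
# 'dc:contributor' and 'comments' are left out on purpose.
my %parsed = (
    title => "Example headline",
    link  => "https://example.org/article",
);

my %item;
foreach my $tag (qw(title dc:contributor link comments)) {
    my $value = $parsed{$tag};
    if (defined($value)) {
        $item{$tag} = $value;
    } else {
        $item{$tag} = "";    # missing tag: store an empty string
    }
}

print "title=[$item{title}] contributor=[$item{'dc:contributor'}]\n";
```

Storing an empty string means every key can be used in a message later
without Perl warning about an uninitialized value.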

We then send a message to the channel, properly formatted, with the feed's
title and the value of the tags for each item.

             $self->say(
                 channel => $arguments->{channel},
-                body => $title->text_content.' - '.$contributor->text_content.': '.$link->text_content,
+                body => "[\002$item{feed_title}\002] $item{title} ($item{'dc:contributor'}) $item{link}: $item{comments}",
             );
         }
     }

Many IRC clients interpret \002 as a control code that toggles bold text.
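
To see what \002 puts into the string, this sketch builds a short message
and then swaps the control byte for a visible marker. The feed title and
headline are made up, and <B> is only a placeholder for printing.

```perl
use strict;
use warnings;

my $feed_title = "Example Feed";    # made-up title
my $body = "[\002$feed_title\002] Example headline";

# \002 is a single byte with value 2; make it visible for inspection.
(my $visible = $body) =~ s/\x02/<B>/g;
print "$visible\n";
```

The first \002 turns bold on and the second turns it off, so only the
feed title is shown in bold.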

(Hint: sample code is in /home/perl104/newsbot.pl)

================================================================================

Username: perl104
Password: Hp9XsPhANc6
Server: freeirc.org
Port: 22

================================================================================