Help Create a New RSS Feed
Attention software programmers, techies, web gurus, etc. JesusitaFire.org is looking for your assistance in writing a webpage-to-XML/RSS scraper for the official County website. Email liam@jesusitafire.org if you would like to help.
Right now, if you want the most current info about the fire, you need to manually refresh the County’s website. They have an RSS feed for their official press releases but they do not have one for their evacuation orders (this is an example of when it would be helpful for the county to use a modern CMS).
It would be much better if the latest evacuation orders were available as an RSS feed that people could subscribe to. It would also be useful if people could receive notifications of new items added by email or SMS. So that’s what we’re trying to do. Our platform for this is http://evac.jesusitafire.org/.
Right now, that site is powered by this Yahoo Pipe, built with the greatly appreciated help of Paul Daniel, and its RSS feed is available here. The problem is that Yahoo limits updates to once every 2 hours. Clearly, this is not fast enough.
We also built a feed using Feed43. However, at this time that feed only updates every 6 hours, which is even worse. If we paid $17, Feed43 Pro would update every hour. We thought about doing that, but even one hour does not seem fast enough. What we need is something that will scrape the page every 10 minutes or so.
So now we are working to write a page scraper in PHP that will check the county website every 10 minutes and publish an update via RSS whenever it detects a change.
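For anyone thinking about how the "detects a change" part could work, here is a rough sketch in Python rather than PHP, purely as an illustration. It assumes the scraper keeps a fingerprint of the last text it saw and compares it to the new one. The function names `fingerprint` and `has_changed` are made up for this example, and normalising whitespace before hashing is our own guess at how to avoid false alarms from trivial markup changes.

```python
import hashlib
from typing import Optional, Tuple


def fingerprint(text: str) -> str:
    """Return a SHA-256 hex digest of the text with runs of whitespace collapsed."""
    normalised = " ".join(text.split())
    return hashlib.sha256(normalised.encode("utf-8")).hexdigest()


def has_changed(new_text: str, last_fingerprint: Optional[str]) -> Tuple[bool, str]:
    """Compare new_text against the previous fingerprint.

    Returns (changed, new_fingerprint). The caller is expected to store
    new_fingerprint somewhere (a file, a database row) for the next run.
    """
    new_fp = fingerprint(new_text)
    return new_fp != last_fingerprint, new_fp
```

A script like this could be run every 10 minutes from a scheduler such as cron, so it does not need its own timing loop. It would only rebuild the feed when `has_changed` reports `True`.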
Examining the source of the county website, we have created the following pattern for scraping:
<span style="color:#0080c0; font-weight:bold" >  Date Last Updated:{%1}</span>{*}<a class="bookmark" id="Evacuation_Warnings" title="Evacuation_Warnings" name="Evacuation_Warnings"></a>{%2}<span class="body_black">Definitions</span>
In this pattern, {%N} represents an area of information we want to extract, and {*} is an area we want to skip over and ignore.
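To make the pattern idea concrete, here is one possible way to turn a pattern like this into a regular expression. It is a sketch in Python, not the PHP scraper itself and not how Feed43 works internally. The helper names `pattern_to_regex` and `extract` are hypothetical. The sketch treats every `{%N}` as a non-greedy capture, every `{*}` as a non-greedy skip, and any run of whitespace in the pattern as "one or more whitespace characters", which is an assumption about how forgiving the match should be. The pattern string is the one from this post, written with plain straight quotes.

```python
import re
from typing import Dict, Optional

# The scraping pattern from this post, with straight quotes.
COUNTY_PATTERN = (
    '<span style="color:#0080c0; font-weight:bold" >  Date Last Updated:{%1}</span>'
    '{*}<a class="bookmark" id="Evacuation_Warnings" title="Evacuation_Warnings" '
    'name="Evacuation_Warnings"></a>{%2}<span class="body_black">Definitions</span>'
)


def pattern_to_regex(pattern: str) -> "re.Pattern[str]":
    """Translate {%N} / {*} placeholders into a compiled regular expression."""
    parts = []
    # The capturing group keeps the placeholders in the split result.
    for token in re.split(r"(\{%\d+\}|\{\*\})", pattern):
        if token == "{*}":
            parts.append(r".*?")                      # skip and ignore
        elif re.fullmatch(r"\{%\d+\}", token):
            parts.append(rf"(?P<f{token[2:-1]}>.*?)")  # capture as field N
        else:
            # Literal text: escape it, but let whitespace runs match loosely.
            pieces = re.split(r"\s+", token)
            parts.append(r"\s+".join(re.escape(p) for p in pieces))
    return re.compile("".join(parts), re.DOTALL)


def extract(pattern: str, html: str) -> Optional[Dict[int, str]]:
    """Return {N: captured text} for each {%N}, or None if the page does not match."""
    match = pattern_to_regex(pattern).search(html)
    if match is None:
        return None
    return {int(name[1:]): value.strip() for name, value in match.groupdict().items()}
```

Calling `extract(COUNTY_PATTERN, page_html)` would give a dictionary with the "last updated" text under key `1` and the evacuation section under key `2`. It gives `None` if the county changes the page layout enough that the pattern no longer matches, which is worth checking for and alerting on.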
The RSS feed should look something like:
- Title: Latest evacuation info as of {%1}
- Link: http://www.countyofsb.org/ceo/oes0.aspx?id=2332
- Date published: {%1}
- Description: {%2}
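The feed layout above could be produced along these lines. Again this is only a sketch in Python using the standard library, with a made-up function name `build_feed`. The channel title and description are placeholders we invented for the example. The `published` argument is assumed to be a timezone-aware datetime that has already been parsed from {%1}, since the exact date format on the county page is not covered here. The `guid` element is our own addition (not in the list above) so that feed readers can tell one update from the next even though the link never changes.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone
from email.utils import format_datetime

COUNTY_LINK = "http://www.countyofsb.org/ceo/oes0.aspx?id=2332"


def build_feed(updated_text: str, description_html: str, published: datetime) -> str:
    """Build a one-item RSS 2.0 document and return it as a string."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    # Channel-level fields are required by RSS 2.0; the wording is a placeholder.
    ET.SubElement(channel, "title").text = "Evacuation info"
    ET.SubElement(channel, "link").text = COUNTY_LINK
    ET.SubElement(channel, "description").text = "Latest evacuation info scraped from the County website"

    item = ET.SubElement(channel, "item")
    ET.SubElement(item, "title").text = f"Latest evacuation info as of {updated_text}"
    ET.SubElement(item, "link").text = COUNTY_LINK
    # RSS expects RFC 822 style dates, which format_datetime produces.
    ET.SubElement(item, "pubDate").text = format_datetime(published)
    # ElementTree escapes the HTML, so it travels safely inside <description>.
    ET.SubElement(item, "description").text = description_html
    # Assumption: a per-update guid, since every item shares the same link.
    guid = ET.SubElement(item, "guid", isPermaLink="false")
    guid.text = f"{COUNTY_LINK}#{updated_text}"
    return ET.tostring(rss, encoding="unicode")
```

A real feed would probably keep the last several items rather than just one, so subscribers who check in late can still see earlier updates.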
So if you know PHP or another way to do this and want to give us a hand, send an email or leave a comment below. Thanks!








