What Do YOU Store In Your WordPress Options? Spam!

Posted on Thu, 16th March 2006 at 12:35 under Software, Politics, Hmmmm..., Skills, WordPress, Education, Coding

How about the full text of many recent posts on several of the WordPress developer’s blogs? Without your knowledge or permission? Does that sound like a valuable option?

You will need: SQL knowledge or interest, geekiness, a cup of coffee.
You should have: A WordPress blog, PHPMyAdmin, half an hour.
My intent: poke fun at WordPress developers, for poor customer service, again.
Your reward: entertainment, enlightenment, knowledge.


Introduction

What is an option? It is a choice, implicitly from many available choices.

What if you get no choice? Is that still an option? No.

What if you had no choice but to store several hundred kilobytes of unnecessary posts from strangers on your blog? Is that an option, or is it spam? Spam.

Those are my answers. Get your own. Here’s my working.

Get Your Database At The Ready

Got your WordPress database ready to be queried? If so, you’re ready for the first question.

Question 1. How Many Options Do I Have?

Your table names may differ. The ones used here are the WordPress defaults. The answers are real.

SELECT
  COUNT(*)
FROM
  wp_options
COUNT(*)
75

WordPress offers me 75 different options from which to choose. That’s a lot.

Question 2. What Is An Option?

I can’t tell whether 75 is a good, bad or indifferent number of options to have. I need to know what an option looks like.

SELECT
  *
FROM
  wp_options
LIMIT
  0, 3

No sense in asking for them all! I only need a few to begin with. I don’t care which ones they are.

option_id option_name option_value option_description
1 siteurl http://www.libertini.net/libertus siteurl is your blog’s URL: for example, ‘http://example.com/wordpress’ (no trailing slash !)
2 blogfilename index.php blogfilename is the name of the default file for your blog
3 blogname libertus blogname is the name of your blog

Note: I have elided some columns from the result.

Among other things, an option, according to WordPress, has a name, a value and a description. What the description is doing in the database I cannot imagine, because it does not change, but it has to go somewhere I suppose.

So there are 75 names and values for options in my WordPress database. Names can be a maximum of 64 characters long, but values, for some reason, can be any length at all.

Question 3. What Is The Average Size Of An Option Value?

Do options really need to be any length at all? URLs can be pretty long, so perhaps 100 or so characters would be sufficient. An option can only be supplied by the user, after all, so it has to be something they put in a text field, or perhaps uploaded from a file.

I don’t recall uploading anything from a file into my WordPress options. Nor do I recall typing all that much at all. So the options shouldn’t be all that big.

SELECT
  AVG(LENGTH(option_value)) AS Average
FROM
  wp_options
Average
1841.0667

I BEG YOUR PARDON?! The values of the 75 options average 1.8Kb each? WTF?!

Question 4. How Much Optional Value Do I Have?

SELECT
  SUM(LENGTH(option_value)) AS Optional
FROM
  wp_options
Optional
138080

138Kb!!! I certainly didn’t ask for that much! Something has to be wrong. That much data in an options table with 75 entries must be an error. Worse still, there could be, horror of horrors, database corruption.

Question 5. What Options Have The Largest Values?

I need to assess the state of things here. How deep does the problem go? How corrupt is my database? I need the top ten worst culprits.

SELECT
  option_id AS id,
  option_name AS name,
  LENGTH(option_value) AS length,
  option_description AS why
FROM
  wp_options
ORDER BY
  length DESC
LIMIT
  0, 10
id name length why
121 rss_867bd5c64f85878d03a060509cd2f92c 118462
119 rss_0ff4b43bd116a9d8720d689c80e7dfd4 17068
117 rss_8ad6720aa1000f718d774ab9f25f1f98 1364
112 page_uris 396
107 recently_edited 244
102 active_plugins 99
96 permalink_structure 36 How the permalinks for your site are constructed. See permalink options page for necessary mod_rewrite rules and more information.
1 siteurl 33 siteurl is your blog’s URL: for example, ‘http://example.com/wordpress’ (no trailing slash !)

Only 7 results are needed to know that the corruption is localised. After 8 results, one of the first options I looked at has returned. Strange, though, how the largest options, including the WHOPPER at 118Kb, have no explanation for their existence whatsoever. The top three especially.

It is reasonable to expect the one thing that consumes more than everything else combined to offer at least some raison d’être. In this case, there are three letters, rss followed by nonsense. Why is there so much rss in my options? I did not ask for it nor do I recognise those random sequences of characters. I did not put them there.

Question 6. What Is The Value Of RSS In The Options?

There’s a lot, so there has to be some value in storing it. Tentatively, look at the smallest one.

SELECT
  option_value
FROM
  wp_options
WHERE
  option_id = 117

Unexpurgated. I kid you not.

O:9:”magpierss”:17:{s:6:”parser”;i:0;s:12:”current_item”;a:0:{}s:5:”items”;a:0:{}s:7:”channel”;a:8:{s:9:”generator”;s:15:”Technorati v1.0″;s:9:”webmaster”;s:43:”support@technorati.com (Technorati Support)”;s:4:”docs”;s:37:”http://blogs.law.harvard.edu/tech/rss”;s:3:”ttl”;s:2:”60″;s:5:”title”;s:57:”Technorati Search for: http://www.libertini.net/libertus/”;s:4:”link”;s:77:”http://www.technorati.com/search/http%3A%2F%2Fwww.libertini.net%2Flibertus%2F”;s:11:”description”;s:62:”Technorati URL search for http://www.libertini.net/libertus/. “;s:7:”tagline”;s:62:”Technorati URL search for http://www.libertini.net/libertus/. “;}s:9:”textinput”;a:4:{s:5:”title”;s:17:”Search Technorati”;s:11:”description”;s:43:”Search millions of blogs for the latest on:”;s:4:”name”;s:1:”s”;s:4:”link”;s:36:”http://www.technorati.com/search.php”;}s:5:”image”;a:3:{s:3:”url”;s:50:”http://static.technorati.com/pix/logos/logo_sm.gif”;s:5:”title”;s:15:”Technorati logo”;s:4:”link”;s:25:”http://www.technorati.com”;}s:9:”feed_type”;s:3:”RSS”;s:12:”feed_version”;s:3:”2.0″;s:5:”stack”;a:0:{}s:9:”inchannel”;b:0;s:6:”initem”;b:0;s:9:”incontent”;b:0;s:11:”intextinput”;b:0;s:7:”inimage”;b:0;s:13:”current_field”;s:0:”";s:17:”current_namespace”;b:0;s:19:”_CONTENT_CONSTRUCTS”;a:6:{i:0;s:7:”content”;i:1;s:7:”summary”;i:2;s:4:”info”;i:3;s:5:”title”;i:4;s:7:”tagline”;i:5;s:9:”copyright”;}}

This is a serialized PHP variable with a lot of information, about my blog and about something called technorati, which I don’t use, so I don’t need 1.7k of RSS feed definition hanging around.

That doesn’t explain why I have 118k of RSS feed defintion hanging around though. What could possibly be in the whopper?

Question 7. May I Have A Small Piece Of That Whopper, Please?

I do not want the whole thing! Just enough to get a feel for what it contains. Just the last two kilobytes.

SELECT
  RIGHT(option_value, 2000) AS endpiece
FROM
  wp_options
WHERE
  option_id = 121

Endpiece

column layout, while individual articles are two-column. The category and archive listings are nice and compact, and there’s an easy-to-spot “Subscribe” link in the corner of the header. The main additions I’d like to see would be a search box, and an “About” page.
</p>

Oh, and it’s powered by WordPress.

“;}i:59;a:6:{s:5:”title”;s:64:”Weblog Tools Collection: Thesis on the Social Phenomena of Blogs”;s:4:”guid”;s:93:”http://weblogtoolscollection.com/archives/2005/09/15/thesis-on-the-social-phenomena-of-blogs/”;s:4:”link”;s:93:”http://weblogtoolscollection.com/archives/2005/09/15/thesis-on-the-social-phenomena-of-blogs/”;s:11:”description”;s:216:”

Thesis on the Social Phenomena of Blogs: Interesting and detailed read. Seasoned bloggers might find the information to be a tad cliche’d.

“;s:7:”pubdate”;s:31:”Thu, 15 Sep 2005 18:00:17 +0000″;s:7:”summary”;s:216:”

Thesis on the Social Phenomena of Blogs: Interesting and detailed read. Seasoned bloggers might find the information to be a tad cliche’d.

“;}}s:7:”channel”;a:5:{s:5:”title”;s:16:”WordPress Planet”;s:4:”link”;s:28:”http://planet.wordpress.org/”;s:8:”language”;s:2:”en”;s:11:”description”;s:47:”WordPress Planet - http://planet.wordpress.org/”;s:7:”tagline”;s:47:”WordPress Planet - http://planet.wordpress.org/”;}s:9:”textinput”;a:0:{}s:5:”image”;a:0:{}s:9:”feed_type”;s:3:”RSS”;s:12:”feed_version”;s:3:”2.0″;s:5:”stack”;a:0:{}s:9:”inchannel”;b:0;s:6:”initem”;b:0;s:9:”incontent”;b:0;s:11:”intextinput”;b:0;s:7:”inimage”;b:0;s:13:”current_field”;s:0:”";s:17:”current_namespace”;b:0;s:19:”_CONTENT_CONSTRUCTS”;a:6:{i:0;s:7:”content”;i:1;s:7:”summary”;i:2;s:4:”info”;i:3;s:5:”title”;i:4;s:7:”tagline”;i:5;s:9:”copyright”;}s:13:”last_modified”;s:31:”Fri, 30 Sep 2005 13:00:30 GMT
“;s:4:”etag”;s:34:”"4a01c5-11f9b-433d36ee;432f77d7″
“;}

Tags? Conversation? There’s a discussion going on in my options! A lot of it that I didn’t ask to be put there. What the hell is going on?

I don’t care.

DELETE
  FROM wp_options
WHERE
  option_id IN (117, 119, 121)

Problem solved. I hate spam.

27 Responses

  1. Only temporary reprieve. Log out and log back in again and whoops! they are back.

    Theres something lurking in the code that reloads them.

    Reply
  2. Get rid of the other rss_longnumber entries as well. that seems to stick it to them!

    Reply
  3. Wordpress does not provide any means in the admin pages for you to amend the rss feed from wordpress.

    If you want to switch off the WP development blog feed go to your wp_yourdbname_options and delete the rows with option_name = rss_longnumber and rss_longnumber_ts

    If you dont know SQL and you dont know how to look under the covers then you will not be able to do this. The WP guys are banking on this to keep their stats going.

    Reply
  4. Liberta said:

    The WP guys are banking on this to keep their stats going.

    You might think so. I couldn’t possibly comment.

    Reply
  5. Let me just clarify what I wrote earlier. The deleting works on the WP blog I have running version 1.5. It does not work on the WP blog I have running version 2.0.

    Verison 2.0 makes it more difficult to remove WP development blog rss feeds. Difficult is not impossible.

    Reply
  6. Difficult is not impossible.

    I take it you won’t be requiring any assistance? ;)

    Reply
  7. Further Investigation

    They asked for this. It is fucking bad. Thank goodness for open source. If I found out this was happening in a commercial product, I would be outraged.

    WordPress blog owners have no choice about taking the Technorati (incoming links) and WordPress (spam) RSS feed content. It cannot be switched off. The taking of the Dashboard feeds is hard-coded into WordPress.

    Further, the entire RSS library used to do this is ONLY used by the Dashboard. In other words, if the Dashboard feeds were not imposed on WordPress blog owners, the rss-functions.php library could be dropped entirely. The proof? Search the WordPress code for rss-functions.php. It is mentioned once, at the top of file wp-admin/index.php. The fetch_rss function is called three times by that file.

    $rss = @fetch_rss(’http://feeds.technorati.com/cosmos/rss/?url=’. trailingslashit(get_option(’home’)) .’&partner=wordpress’);
    $rss = @fetch_rss(’http://wordpress.org/development/feed/’);
    $rss = @fetch_rss(’http://planet.wordpress.org/feed/’);

    The first asks Technorati for incoming links to your blog. It would be completely innocuous if not for the partner=wordpress. Also, if you do not use Technorati, it is useless. If you do not want to use Technorati, tough.
    The second asks WordPress for the development news feed. Useful for learning about WordPress updates, unless you check their website regularly.
    The third asks WordPress for the commercial news feed. Useful for learning about the WordPress community, unless you don’t care, or check their website regularly.

    Why is your blog programmed to unconditionally acquire, store and present this information for you, dear WordPress blog owner? Do you want it? Did you ask for it? Is it of any benefit to you, whatsoever? There is a link to WordPress on every page of your site. Could you not find the same information, voluntarily, through that?

    To make matters worse, rss-functions.php is HEAVY. It loads the XML library and parses XML. It has its own caching functions. In short, it is bloody enormous and it is loaded every time the Dashboard is viewed.

    All to process, store and display several hundred kilobytes of spam. Disgraceful!

    Reply
  8. Replacing The WordPress Dashboard With A Plugin

    Fortunately, with a little ingenuity, the WordPress Dashboard privacy invasion system can be removed. The menu system can be plugged into using the admin-menu action. The plugin can search for the menu item and redirect the browser to a better Dashboard generated by the same plugin.

    Code is poetry, and I like to write poetry off the top of my head.

    <?php
    ?>

    There’s that scary start again! This time, though, the fact that this PHP program must be a WordPress plugin totally drives the writing of the code. Making a PHP program into a WordPress plugin means following a form. Not just any old code will do.

    <?php
    /*
    Plugin Name: Zeitgeist
    Plugin URI: http://www.libertini.net/libertus/zeitgeist/
    Description: Replacement for the WordPress.org “Dashboard” privacy invasion system. Zeitgeist tells *you* what is happening on *your* blog rather than what is happening with WordPress.org. Unlike Dashboard, Zeitgeist does *not* tell WordPress.org what *you* are doing.
    Version: 0
    Author: Paul Mitchell aka Libertus
    Author URI: http://www.libertini.net/libertus/
    */
    ?>

    That’s a good start for a WordPress plugin. Just comments. From now on, It’ll just be code.

    Reply
  9. Participate In The Administration Of WordPress

    All the plugin has to do is add an action of type admin_menu specifying which function will be called. My function is called zeitgeist.

    add_action( 'admin_menu', 'zeitgeist' );
    
    function zeitgeist() {
    }

    I’m in! What I want to do is detect if the current page is the admin index page and, if so, redirect output from this plugin. The WordPress menu system supports this, but I want to be subversive, not co-operative.

    The first thing I want to do is eliminate the Dashboard from the menu system entirely. To do that, I need direct access to the menu storage, which is in a global variable $menu. I know the Dashboard is element zero in the menu.

    global $menu;
    $menu[0][0] = 'Zeitgeist';
    $menu[0][2] = 'zeitgeist.php';

    Next, the subversion. index.php runs admin.php before displaying the hard-coded Dashboard. Just after running the admin menu system, a check is made to see whether a plugin page was requested via a URL parameter. I want to make WordPress think one was, even though one wasn’t.

    $_GET['page'] = 'zeitgeist.php';

    If this code is successful, instead of displaying the Dashboard, WordPress will display the admin page for my plugin. Which doesn’t exist yet, but that’s OK. It will soon.

    Time for some local testing. :blush: I seem to have forgotten the admin password for my local blog. Unfortunately, my admin email address is “donald@duck.com”. I’ll have to look at the database.

    UPDATE wp_users SET user_email=’paul@localhost’ WHERE user_login = ‘admin’

    Reply
  10. Only The WordPress Dashboard, Please

    My plugin changes the menu to read Zeitgeist but redirects all admin pages to a non-existent one! It needs to be more selective. Only when the currently executing script is /wp-admin/index.php. The name of the currently executing script is held in the PHP superglobal $_SERVER[’PHP_SELF’]. I’d like to take a look at what’s in it.

    add_action( 'admin_menu', 'zeitgeist' );
    
    function zeitgeist() {
    
      global $menu;
      $menu[0][0] = 'Zeitgeist';
      $menu[0][2] = 'zeitgeist.php';
    
      $_GET['page'] = 'zeitgeist.php';
    
      echo $_SERVER['PHP_SELF'];
    
    }

    I get /~paul/blog/wp-admin/index.php for the Dashboard, so my conditional is pretty easy. Test that the final 19 characters of the script name are the same as for the Dashboard, and redirect only if that is the case.

    if( '/wp-admin/index.php' == substr($_SERVER['PHP_SELF'],-19,19) )
      $_GET['page'] = 'zeitgeist.php';
    

    Success! The Dashboard is gone. No more privacy-invading RSS feeds. Unfortunately, also no clue about what is going on. The recent posts and comments are no more, gone with the Dashboard. That won’t do!

    Reply
  11. Zeitgeist Is Coming…

    The eagle-eyed among you may already have noticed that I’ve been using something called Zeitgeist for some time now. I depend on it to effectively manage my blog.

    So, the simplest solution is to take what I already have and stick it into this plugin as a management page. That means WordPress has to be subverted a little further. I don’t want to add a page, just output something for a page I have replaced, so it needs a function name.

    WordPress decides how to call the function like so:

    $plugin_page = stripslashes($_GET['page']);
    $plugin_page = plugin_basename($plugin_page);
    $page_hook = get_plugin_page_hook($plugin_page, $pagenow);
    
    ...
    
    do_action($page_hook);

    If I know the hook, I know the action. If I know the action, I can add it only rather than a whole menu item. Echo it out.

    There isn’t one, but that’s even better. If there isn’t a known page hook, WordPress just includes the plug-in file over again. I can detect that and just spit out the page, no worries.

    Reply
  12. Here’s Something I Prepared Earlier

    Hack! Slash! Twiddle! Lo and behold! Zeitgeist lives!

    Reply
  13. Conception Precedes Birth

    Not very stylish. With a touch of spit and polish, it becomes Zeitgeist 0.1, a real plugin.

    As an expression of confidence, I’m committing it to CVS. Zeitgeist 0.1 is something I can build upon in future, for sure.

    That done, proceed to test on the Windows blog, which has more content, and already runs a similar system but by hand-coding rather than by plug-in. In fact, it runs a completely different version of WordPress.

    Reply
  14. Teething Troubles

    Q: What is the purpose of testing?
    A: Finding those embarrasing or painful problems before they fuck up anything important.

    Q: What is the most irritating cause of problems during testing?
    A: Fucking cut and paste coding, that’s what!

    I break my own rule and it comes right back and bites me. Zeitgeist by default sets a period of interest that begins now and stops two weeks ago. There is a parameter that the prototype passes itself, by way of a More… link, that extends the period of interest to a month and the top 16 posts rather than the default top 8.

    Unfortunately, the construction of the URL for the link depended upon the precise way the prototype had been incorporated into WordPress. A plugin is executed differently and mine is is broken because it is still partly the prototype, which wasn’t a plugin. I have to ensure the period of interest parameter, which I shall now simply name zeitgeist, is correctly incorporated into the page URL.

    While doing that, I remember that the moderator needs to know when comments are awaiting moderation. I need to add that. Mental note: remember future posts. Mental note: remember spam stats.

    Reply
  15. Zeitgeist Plugin For WordPress 0.2 Goes Live, Privately

    This has all been about privacy, whatever you think that is. I have a very strict view of privacy, wanting to know nothing I need not, and expecting the same of others.

    WordPress and Technorati do not need to know about every WordPress blog I run. I do not need to know what Technorati and WordPress are doing.

    So at least one blog they thought they knew about is going silent. It will be dead to them. It is a blog where privacy matters to people.

    Reply
  16. Zeitgeist Plugin for WordPress 0.2 Goes Live, Publically

    On this blog. I had been running the prototype for months anyway. The plugin is just nicer. Easier to maintain and update for starters.

    Reply
  17. Can I have a copy too please?

    Reply
  18. Liberta,

    Your order for Zeitgeist Plugin for WordPress version 0.2 has been dispatched via e-mail. I look forward to hearing about your experiences with it.

    Reply
  19. So at least one blog they thought they knew about is going silent. It will be dead to them. It is a blog where privacy matters to people.

    This is sooo funny, but for all the wrong reasons… I hope you enjoy the privacy of your public weblog. It will last, oh, until early next week when google will have indexed it. If you want to keep things private, don’t post them on the web.

    Reply
  20. ROBOTS.TXT

    User-agent: *
    Disallow: /

    Now skugg, is Google going to index my private site, or not? Public access does not mean public viewing. If Google disobeys the robot exclusion, will it get any of the private data? Will anyone know about the private site if I don’t tell them it exists?

    Don’t post on the web? Did you read about my local blog installation for testing, which STILL informed WordPress about its existence, even though impossible to access from outside?

    Google, although all-powerful and all-seeing, is polite enough to ask permission.

    Reply
  21. google might obey robots.txt directives, but there are loads of spiders that will not.

    Don’t post on the web? Did you read about my local blog installation for testing, which STILL informed WordPress about its existence, even though impossible to access from outside?

    Would you rather the wordpress people spent developer hours working on this edge case (adding code paths that would only be used if the blog is not publicly viewable) or implement features that you actually use?

    PS: Wordpress, like all free software products, comes with a money-back guarantee; if you’re so unhappy with the product, I suggest you use it.

    Reply
  22. Skugg,

    Please post the link to the official WordPress money-back guarantee.

    I would rather the WordPress developers spend time developing their product, not pulling in useless RSS feeds, spamming my database and monitoring the activity on my blogs.

    All they need to add is a switch to the options such as “Disable WordPress spam feeds” or “Stop this blog telling Technorati it exists”, or perhaps some more marketing-friendly text.

    Spyware is spyware, whether it is free or commercial. No-one should be allowed to get away with spying on users or marketing to them without permission.

    Reply
  23. Zeitgeist Evolves

    Version 0.2 has been live on a few blogs for a few days. So far there have been no reports of any problems, which is pleasing. That gives have a known stable base to fuck up! Yippee!

    Goal

    Produce saleable product. The Zeitgeist plug-in for WordPress is not free or open source software, despite being delivered in source form. Zeitgeist is good enough to be worth five bucks, per blog, per version. Zeitgeist saves that in time alone in less than a week. Best of all, Zeitgeist is developed by time-served software craftsmen, still valued by some these days, rather than wizards, which are cheap and freely available.

    Feature List

    Task; reading, participating, working, moderating
    Focus; posts, comments, people, categories, schedule, moderation
    Interest; time period, eliminate self, responses
    Extras; excerpts, edit links, stay back-stage, refresh and page ageing, accessibility
    Feeds; simple RSS aggregator
    Style; pick out user levels, place divisions
    Setup; only what you want, named groups of settings
    Help; write, link

    Technical Changes

    Structure; functions
    Database; design new and improve existing SQL queries, lighten as much as possible
    Calculation; adapt for new incoming data
    Output; tidy up, plurals, write new bits

    Reply
  24. The rss feeds may well me: from WP the Development Blog, which is of some use for updates and security use. I’d leave that one in there. The technorati one is used in the Links Coming In section and goes straight to your ranking on technorati’s homepage. That would be a matter of choice, but the Links Coming In shows who has commented or referred to your blog on their’s, also of use to some people.

    Reply
    1. I don’t doubt the utility of the development blog feed, but it so rarely changes that there’s no point in requesting an update nearly every time the Dashboard is loaded.

      My Zeitgeist plugin emulates the “Latest Activity” box from the Dashboard because other plugins use it to communicate. So that people could also choose to use the Incoming Links section, I bundle a mini-plugin with Zeitgeist that puts it back.

      I’m hoping that WP 2.1 will have a pluggable Dashboard.

      Reply
  25. In the course of moving my blog I took a look at the MySQL database for the first time and found the options table “spam” you refer to.

    This is the only place on the web I’ve been able to find a reerence to the issue (certainly it’s nowhere on the WP support forums that I’ve found so far.)

    Am appalled and about to drop wordpress in favor of textpattern —

    Thanks for putting this information out there. Especially since on one else will!

    Reply
    1. You’re welcome and I’m glad you found it of some use. Perhaps no-one else considers caching RSS feeds in an options table to be a problem?

      Reply

Leave a Reply

You may also log in to post a comment.

XHTML:

If you want to <q>tag</q>, please balance these; a, i, em, b, strong, u, blockquote, q, ul, li, ol, abbr, code, pre, sub and sup.