What Blogs Begat

OK, something I’ve been mulling lately.

In a recent screed, I pretty much outlined my beliefs about blogs:

  • Blogs are cool and fun to do, especially for the word-inclined
  • Most blogs aren’t worth the paper they aren’t printed on except to the people who make them, or a “small circle of friends” (important distinction: my taste in books is worthless to others, but I love my books and like my taste. That sorta thing.)
  • While blogs are changing things, they are not creating the earth-shattering upheavals many seem to posit for this medium

Right or wrong, that’s what I believe.

But let’s look at that last point: “…blogs are changing things”

While I don’t believe blogs will be the end of journalism as we know it today (just as radio and TV didn’t end the forms of journalism that came before them: print => radio => TV => Web => blog), blogs will change and extend journalism.

Each medium extended journalism – and, yes, to the slight detriment of the other conduits. That’s to be expected. But wipe out this conduit or the other? No way. Look at the most ancient conduit: word of mouth.

Still around. Don’t tell me you haven’t seen a movie, read a book, possibly even voted for a politician based primarily upon what someone else has told you.

Right?

OK, we’ve flogged that horse to death – and that’s not what I wanted to say here.

Blogs are changing things, but I think one of the most influential roles blogs – the whole blog movement – are playing is to facilitate change.

I’ve been reading Dave Winer more closely the last few months, and through him and other folks I read/have corresponded with, I’m seeing a trend of blogs speeding change and standardization – the target is blogs, but it is spilling over to other areas.

Was that obtuse enough? Some examples:

  • I don’t know – and don’t much care – whether RSS or blogs came first, but blogs are the first place I have really seen RSS take hold. And now these feeds are popping up in places like news.com and other non-blog sites. It’s getting fairly commonplace on tech sites (understandable), but if it’s not a fad it will bleed into the regular Web news sites and then beyond. That’s good.
  • There was an article Dave pointed to today that talked about RSS-to-iPod (Apple’s) software. As other sites – such as CNN – begin to deploy RSS, suddenly your iPod carries CNN headlines. We are reaching that convergence point, and it’s because of blogs…indirectly.
  • The whole concept of journalists posting ALL notes/full transcript of interviews etc on their blogs (sanctioned/required or not by news outfit) may drive things to the point where journalists blog interviews with foreign leaders just like the gang at Boing Boing blog tech conferences. How cool would that be?
  • There seems to be a very strong open-source developer community behind blogs, and a lot of impressive folks (Dave W., Ray Ozzie etc) are putting code or concepts out there; the community responds, BANG! Better software (or, at least, a VERY short “vaporware” interval).
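Part of why RSS spreads so easily is how little machinery it takes to consume. A minimal sketch in Python (the feed content here is made up for illustration; a real aggregator would fetch the XML from a site’s feed URL):

```python
import xml.etree.ElementTree as ET

# A tiny RSS 2.0 document, inlined so the sketch is self-contained.
# A real aggregator would fetch this from a site's feed URL instead.
RSS_SAMPLE = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example News</title>
    <item><title>Headline one</title><link>http://example.com/1</link></item>
    <item><title>Headline two</title><link>http://example.com/2</link></item>
  </channel>
</rss>"""

def headlines(rss_text):
    """Return (title, link) pairs for each <item> in an RSS 2.0 feed."""
    root = ET.fromstring(rss_text)
    return [(item.findtext("title"), item.findtext("link"))
            for item in root.iter("item")]

for title, link in headlines(RSS_SAMPLE):
    print(title, "->", link)
```

That’s the whole trick: once a site emits that little bit of XML, anything (a news page, an iPod sync script) can pull its headlines.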

This – to me – is an interesting sidebar to the whole blogsplosion.

Worth noting; worth watching.

As I said sometime earlier, no one is really sure what the whole blog thing will lead to (if anything); it may stay the same or change considerably.

What I didn’t consider was its impact on those around and in it.

Me likes.

Power-grid broadband

I read an article a few days ago on ZDNET about the FCC giving at least preliminary approval to powerline-based broadband Net connections.

Interesting, but while this may make broadband more accessible (especially in outlying and remote areas), there are other potential ramifications to this type of Internet distribution that I have not seen addressed:

  • Who actually gets to provide the access? Sure, the powerline becomes the pipe, but there has to be an ISP (even if it turns out to be the power company) somewhere to peer into the network backbone, hand out e-mail addresses and so on.
  • Speaking of peering, how will this work? Will the power lines become part of the Internet, or be walled off, like a giant T1 line that connects only to that company’s lines?
  • Power surges?

Actually, at first I was worried about home networking — if every plug is an Internet connection, why would Linksys be needed except as a firewall? — but it still requires a modem (the size of a deck of cards, the article says).

Which brings up another point: At what point will ALL modem stuff become standardized, the way Wi-Fi has (such as on PCMCIA cards)? So it’s built into the computer, and there aren’t even small (deck-of-cards) modems at all?

Or built into the router? Here, I have a cable coming into a modem about the size of a paperback book (a bigger one). That goes – now all via Ethernet cables – to my router, which only then goes to all machines (so all machines get the firewall, DHCP etc.).

While I understand why the modem is needed, it basically is just a big box (that requires a power plug) providing nothing more than a cable-to-Ethernet adapter.

CSS Testiness

OK, I’m a huge fan of CSS. To paraphrase the Amex commercial, “I don’t build a Web site without it.”

Yet I have complained (from time to time) about shortcomings in CSS, real or perceived.

One complaint I’ve been voicing (under my breath, to myself) recently is the lack of variable scope in CSS.

There aren’t any variables (much less variable scope) in a CSS file, unless you parse out a NON-“.css” file (myCSS.php) and make replacements that way.

This works, but then it is not as flexible.

Yes yes yeah yeah….CSS is spozed to be this static include file blah blah.

But just as the Web — with its static HTML pages — evolved into a very dynamic code-generation system (“No database? That is so 1996…”), one would think this logic would be available for CSS. It doesn’t appear that it is. CSS is killing me.

Personalization and ease of maintenance calls for CSS to be more dynamic.

Is this possible – in a regular CSS file???

I don’t know of it, but I’d like it.

Take a simple example, for one area of a site without personalization: The same colors echo – either as “color” or “background-color” – throughout the style sheet. If I want to change the background color from black (with white text) to white (with black text), I have to make all sorts of switches, including those for hovers, links and so on.

Isn’t there a programmatic way — within the style sheet — to make these variables ($dark = "#000000"; $lite = "#ffffff")…well, variables?

Yes, I can do it through scripting, or “search and replace” …but…why?

Why can’t I change three variables in a “.css” file and have that cascade?

Interesting thought. I’ve done a bit of work with this, with personalization (either from a database or cookie), applying these user preferences to a NON-CSS file (a file that can be parsed so the variables I pass to it are captured) that is then written out as a stylesheet.

I’ve also done it so each user – upon selection of either schema or personalized choices – gets a “unique” style sheet that is then written out and included each time (this seems more efficient; also more clumsy – if cookie or db values, why not just include them into an included file? Want 12,000 “[user_id].css” files clogging up your file system??)
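The parse-and-substitute idea is only a few lines of code. A minimal sketch in Python for illustration (I did mine in PHP; the variable names and palette here are made up):

```python
from string import Template

# Hypothetical "CSS with variables" source: the non-.css file that gets
# parsed server-side before being served as plain CSS.
CSS_TEMPLATE = Template("""
body    { background-color: $dark; color: $lite; }
a       { color: $accent; }
a:hover { color: $lite; background-color: $accent; }
""")

def render_css(dark="#000000", lite="#ffffff", accent="#3366cc"):
    """Substitute the palette variables and return plain CSS text."""
    return CSS_TEMPLATE.substitute(dark=dark, lite=lite, accent=accent)

# Flip the whole color scheme by changing two values in one place:
print(render_css(dark="#ffffff", lite="#000000"))
```

The output is an ordinary static stylesheet; the “variables” live only in the source file, which is exactly the cascade-from-three-values behavior I’m wishing plain CSS had.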

I have to get deeper into this…

04/16/2003 update: Yes, I know you can use JavaScript to make the changes (I do this on the “Text size” link in the menu), but — again — this is programmatic, not variable-based. You still have to, for example, change every instance of “#000000” to “#ffffff” or whatever, instead of just saying “$darkColor = #ffffff” and having it apply globally.

Googleperplexed

Or – let the Google-bashing begin!

As noted in an earlier entry, Google is in the crosshairs of the technorati now, simply by virtue of its success.

Yes: In America, kicking someone when they are down is bad manners, but kicking someone when they are up, hey, that’s just American!

The latest GoogleFlak is the analysis of Google’s SafeSearch option by Harvard’s Ben Edelman.

Edelman’s conclusion: SafeSearch blocks thousands of innocuous sites (example: “Hardcore Visual Basic Programming”).

My reactions:

  1. GASP! I’m shocked! Stunned! Amazed!
  2. Nice catch – this might push Google to do better with this
  3. WHO CARES/SO WHAT!?

In order:

GASP! I’m shocked! Stunned! Amazed!

Why should Google be any different than the other filters out there? There are companies whose entire purpose is to correctly filter out the naughty sites, and they unfailingly block sites that are useful (CDC, breast-cancer sites, and so on). Especially in an automated fashion, it’s tough.

That’s one of the reasons that the ACLU and librarians don’t want to have to install filters on library computers: So much good stuff will be blocked out, as well.

While Google is certainly positioned to do a better job than the net filters, I never really imagined that Google would do that much better (see No. 3 for more on this), at least at first.

Nice catch – this might push Google to do better with this

It’s always good to have watchdogs out there – for causes I believe in, those I don’t, those I don’t care about. This group examination is a good “checks and balances” type system. Good for Ben.

It’s doubtful that any harm will come from this analysis/publicity. Yes, Google may have to work a little harder, but that will earn them some respect etc. We can all win. Good for Ben, again.

WHO CARES/SO WHAT!?

I read so much hand-wringing reaction over this “discovery,” mainly on blogs but in tech columns, as well; I just don’t understand the fuss.

Let’s look at a few facts and observations:

  • Google never promised that the SafeSearch filter was 100 percent accurate.
  • What is 100 percent accurate? I think access to information about contraceptive choices should be allowed through; you may think this is unsuitable for your child.
  • Google’s response to this study is that they try to err on the side of caution: Whether or not this is true, it seems to be a good policy. Kind of like the “innocent until proven guilty” concept. If in doubt, suppress. AND NOTE that this suppression is not censorship. The user turned on the filter, and can always turn it off and resubmit.

  • You don’t have to use the filter – Unlike the debate over library filters, Google can be used in two ways: Filtered and unfiltered. Feel like you’re missing things? Turn filter off (this is the default). Getting too many naughty links for your taste? Turn the filter on. Your choice.
  • Google is not in the business of this type of filtering – the accuracy of their filter is probably not as high a priority as other projects/tools. Let’s be realistic. (Note: I’m fully aware that Google is, basically, a harvesting and filtering company, so filtering [indexing, page rank etc] is key to its operation. But not in the “naughty or nice” way — at least not currently.)

I don’t know, while it’s nice that the study was done and hopefully shared with Google, I just don’t see what all the fuss is about.

It’s as though people expected Google to somehow do a perfect job of this peripheral project. Why?

And has anyone examined, say, Yahoo’s protected search to see how much better/worse it does? I read nothing about this concept in any of the articles/blogs I read.

Hey, the Google porn filter could be 100 times better than Yahoo’s (or Teoma’s etc…); it could be 100 times worse.

Let’s see some comparisons, and then we’ll have something to talk about.

========

Note: I wrote to Dave Winer about this; he forwarded my message to Ben. Both sent back nice, sometimes-not-agreeing messages to my thoughts. Excellent. I like the give-and-take; it clears the mental cobwebs.

I guess where we still have to agree to disagree is this: while Google has a bunch of really smart techies, filtering, to my mind, is not high on their priority list. Dave and Ben still hold to the “surprised Google didn’t do better” stance; I’m not surprised. It’s not on their radar (it should be; it’s a profit center…).

Ben’s note was the most detailed; reproduced below:

Lee,

Your thinking as to the context of this work, its value, and its reception in the community and media is generally consistent with my own.

I do disagree somewhat with your suggestion that there was no reason to think Google might do a far better job in this field than anyone else. They have quite a talented bunch of engineers, with impressive results in major fields (i.e. core search quality!). They also have a huge database of content in their cache. And I, at least, have found it difficult to get a good sense of just what AI systems can do and what they can’t — on one hand, they’re clearly still imperfect, but on the other hand I’m sometimes shocked by just how good they can be. All that’s to say — I started this project with the sense that SafeSearch might well get a clean bill of health.

My real focus here, though, did become the “push Google to be better with this,” as you propose in your #2. The service has been in place for three years without, I gather, any large-scale investigation of its accuracy or effectiveness. (And I say that with full readiness to admit that there’s lots more I, or others, could do; I’m not sure I’d call what I’ve done so far a “thorough” investigation, given the millions of search terms and billions of result pages not checked.) I’m hopeful that my work will cause Google to reevaluate some of their decisions and, perhaps most importantly, improve their transparency and documentation as to how the system works.

As to the “who cares” reaction — there’s always the potential, in blogspace as well as in commercial news sites, for a story to get overblown. I’m not immediately prepared to say whether that’s what’s happening here. Personally, I think coverage like that on http://dognews.blogspot.com/ (see the 3:31PM post of yesterday; their deep/permanent links unfortunately aren’t working quite right at present) isn’t such a bad thing and doesn’t make the world a worse place!

Anyway, thanks for the clear thinking here and the explicit taxonomy of the several approaches to this project. That’s a nice and, I think, helpful way to present the varying perspectives here.

Ben Edelman
Berkman Center for Internet & Society
Harvard Law School
http://cyber.law.harvard.edu/edelman

Not a Troll

Hey, my entry below dissing mySQL was not meant as a troll.

I said:

When I started this project, I sort of gave in and had it running against mySQL — while I hate it, it is the dominant open-source database (for better or for worse…). MovableType runs against this, and MT is used all over the Blogsphere, so whatever…

As I began coding and wanted to do stuff, however, I quickly ran out of obscenities to use for this sad excuse of a database.

Sorry, mySQL doesn’t do it for me.

To me, mySQL = Microsoft Access. Both do a lot and a lot well. For 90% of the uses out there, this is all that is needed just about all of the time.

And both databases are incredibly simple to set up (hell, I’m still screwing with an Oracle install on one of my Linux boxes. What a pain in the keester!). Postgres is a little awkward to set up (“Is the postmaster running on port 564” or whatever errors) — you have to create users, initdb and all that.

For me, mySQL was a no-brainer (perfect for me!) to install: Installed (from RPM) and it was there. Bang. Simple.

Access, of course, is just “there.”

So both Access and mySQL have merits, but not for running a high-volume, highly transactional Web site.

Yes, my opinion. But look at how often Slashdot – Perl/Mason against mySQL – goes down. Daily. I can’t imagine it being the code (though the code is pretty convoluted – download the TAR and look at it. Messy!).

Ditto for the new Internet meme I wrote about recently: Blogshares.com – PHP against a mySQL database. While the traffic volume may well have played a role in its at-least-initial instability/slowness, I think the database was a bad choice. First of all, I think mySQL is just not hardy enough for it, and this is a site that screams out for stored procedures – which are not supported by mySQL (and still will not be in the next [4.1] release).

mySQL seems to do well with simple selects, but even this “talent” is being usurped by Postgres, at least according to a fairly impartial test of the two by Tim Perdue of phpbuilder.net.

And while my Blogging tool will probably never move off my home (behind firewall blah blah) box, while it will probably never be used for actual production of my or anyone else’s blog, it is still designed to be used by multiple users with high volume.

At least that’s my goal — so why not set the bar high?

RE: blogshares – While I do think that mySQL is not a good choice for this site (lots of reads and writes; data-integrity issues; transaction issues [transactions that mySQL does not support]), I fully acknowledge that it’s tough to find a host that will run Postgres for you. (To be honest, I don’t know if it’s even possible…)

The Linux hosts all come with Perl, virtually all offer PHP (at least in the upgrade package).

Databases are a different story. Usually there is mySQL and sometimes mSQL — and the database option is often an upgrade. This is changing slowly, but still, it’s rough to get a good/straightforward database hosting on Linux. (This is also true on the Windows side, with different databases: Access/MS SQL Server/FoxPro.)

So the user is pretty much stuck with mySQL (mucho better than mSQL – at least from what I read…).

So I understand the choice. I just think it’s a bad one that is going to be problematic, and — guess what? — it already is.

That said, I see the reasons that MovableType.org went with mySQL:

  • Like it or not, that’s the database you can get from a Web host. ‘Nuff said.
  • While I have serious “issues” with mySQL, it does do well in “selects only” areas. And what is a blog? ONE person makes updates (inserts), the rest is reads except for a possible comments section. Like Access, mySQL is well-suited for this.

That still does not explain why Slashdot has not converted over to either Postgres or Oracle: This is a highly-transactional site.

In addition to users clicking around to stories and comments, there are users adding comments, users meta-moderating, users being added/edited and so on.
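To make “highly transactional” concrete: the point of transactions is that multi-step writes either all land or all roll back. A minimal sketch using Python’s built-in sqlite3 (standing in for a transactional database like Postgres; the tables and the simulated failure are invented for illustration):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE comments (id INTEGER PRIMARY KEY, body TEXT)")
con.execute("CREATE TABLE stats (n_comments INTEGER)")
con.execute("INSERT INTO stats VALUES (0)")
con.commit()

def post_comment(body):
    """Insert the comment and bump the site-wide counter as one atomic unit."""
    try:
        with con:  # opens a transaction; commits on success, rolls back on error
            con.execute("INSERT INTO comments (body) VALUES (?)", (body,))
            con.execute("UPDATE stats SET n_comments = n_comments + 1")
            if body is None:
                raise ValueError("bad comment")  # simulate a failure mid-write
    except ValueError:
        pass  # the rollback has already undone the partial insert

post_comment("First post!")
post_comment(None)  # fails midway; neither table is left half-updated

n = con.execute("SELECT n_comments FROM stats").fetchone()[0]
rows = con.execute("SELECT COUNT(*) FROM comments").fetchone()[0]
print(n, rows)  # counter and comment table stay in step
```

Without the transaction, the failed second post could leave the counter and the table disagreeing, which is exactly the data-integrity worry on a busy read-write site.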

There’s a lot of shit going on.

And – about once a day, it seems — that “lot of” hits the fan…

Blogging Tools

Just for the hell of it, I’ve decided to build a blogging tool.

You know, a Blogger- or MovableType-type tool.

Why? Because I can. And because I’ll learn stuff doing it.

I’m building it on my Linux box in PHP against a Postgres database.

When I started this project, I sort of gave in and had it running against mySQL — while I hate it, it is the dominant open-source database (for better or for worse…). MovableType runs against this, and MT is used all over the Blogsphere, so whatever…

As I began coding and wanted to do stuff, however, I quickly ran out of obscenities to use for this sad excuse of a database.

I rebuilt tables in Postgres and I have not looked back. Everything I’ve wanted to do is easily handled in Postgres. Damn. Nice database.

One thing I don’t particularly like – and maybe there’s a way around it – is the string concatenation operator — it’s “||” (without quotes, obviously).

To me, that’s an “or” operator — I’m used to the ampersand (&) or a period (“.”) for string concatenation. I wouldn’t mind the double pipes, except that it is so much like an OR operator. Seems weird.

Example: a value of “Mary” in the first_name column. To make it “Mary Ann” one would enter:

Update tableName set first_name = first_name || ' Ann' where [some restriction…name_id = 12 or whatever]

To me, this reads “Update first name to first name OR Ann where…” – but that’s just me, I guess.

On the other hand, Postgres is so Oracle-like that there might be another way to do this that is more traditional. Still, it’s a little non-intuitive to me (I tried the ampersand and dot before hitting Google and finding a solution). Update: All my searching tells me that the double pipe — || — is the only string concatenation operator. Oh well. Can’t say I’m thrilled with that, but what the hell…
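Incidentally, || is the SQL-standard concatenation operator, not just an Oracle-ism, which is presumably why Postgres went with it. SQLite speaks it too, so the Update above can be run as-is via Python’s built-in sqlite3 (table and values invented to match the example):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE people (name_id INTEGER, first_name TEXT)")
con.execute("INSERT INTO people VALUES (12, 'Mary')")

# The SQL-standard concatenation operator: || joins strings; it is not an OR.
con.execute("UPDATE people SET first_name = first_name || ' Ann' WHERE name_id = 12")

print(con.execute("SELECT first_name FROM people WHERE name_id = 12").fetchone()[0])
# -> Mary Ann
```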

Note: mySQL’s way of doing this is the CONCAT() function — now that’s WAY weird, to me…

Blogshares — Update

Wish this were not the case – and wish I did not predict it previously – but:

I seriously need to investigate some more reliable hosting with better capacity and on a different server to my mail server

(from Blogshares.com, ~5pm CST)

The owner/publisher (what DO we call them? Not “webmaster”, not now…) also said the following: “BlogShares is going to be huge (honest!)” – and I agree.

For how long?

Internet meme, remember…

Blogshares – The Price of Popularity

While the addictive and increasingly popular blogshares site is not really suffering from a true slashdot effect (a spike in traffic due to publicity of some sort), it does appear to be groaning under the weight of its popularity.

Over the last day, a few hits have come up with the following results:

  • Server down but box still pingable
  • mySQL “too many connections” error (another Slashdot legacy … mySQL trying to run large sites….)
  • Server down/box unpingable (current state, as I type this)

While I think the site is great, in the “Internet meme” sort of way (remember “hotornot.com”?), it’s suffering from two main problems:

  • It basically seems to be a labor of love, and — while well done — doesn’t have the hardware it needs behind it. Could be unoptimized queries and so on, as well. Tough to do a perfect job with one person (or small group).
  • It’s getting way popular. This site is going to have to change its hosting option if the traffic keeps up.

But always cool to see a new concept out there, and this one really is well done, both from the concept through the details (how to pick “shares” and so on) through the look and feel. Outside of stability and some little things I find awkward or just plain problematic, it rocks. I wish I could do something as complex as well.

Hell, I wish I could do something trivial as well.