Statistical data about first years of usenet traffic

Please pardon this intrusion. I've been ego-grepping on Google recently, mainly to see whether I'm remembered on a group I dropped some while back. (Also because it's much the most efficient way to find threads that interest me via Google, sigh.)

Anyway, there are several things in the post I'm following up to that may be useful to the original poster, or to anyone who finds this thread in the archives years from now for that matter. So here goes. I'm not pretending that all of this very long post will interest the average reader of this group; sorry!

Studying the early years of Usenet is going to be *hard*. Unfortunately, studying the somewhat *less* early years of Usenet is going to be *even harder*.

You probably need to know, first of all, about how Google's archive was built, because unless there's been a revolution in this area and nobody told me, Google's archive is *by far* the most important source you're going to have, or at least, where the 1980s are concerned, is based on that source.

So you need to find the post in which Google explained how its archive had been built. Let me see if I can find that... OK. Seems Google's addressing has changed, which is definitely a pain, sigh. But anyway, here's the main thing you need:

Depending on what my news server and your news reader do, you may need to re-wrap that. Anyway, the message-ID you want is

You may also want to find various other materials explaining some of the details of the archives. Henry Spencer posted a fairly long explanation of how the Toronto archives, the main source for the 1980s, were built. You can presumably find this by digging on Google; be sure to *limit your search* to dates later than November 30, 2001 so you don't have to wade through everything Henry Spencer ever posted (there's a lot of it).

David Wiseman, who was heavily involved with the *retrieval* of the Toronto archives, put on his website a cross between an explanation and a reminiscence of that retrieval. If you want some idea why other old archives aren't being retrieved, it may be worth reading that.
In any event, it's further information about your main primary sources, and for that reason you kind of have to read it. Anyway, there's another reason to look up his website. The way I usually remember his URL is to go to Google's timeline. The timeline is not especially valuable in its own right (in fact, the first entry in it lies, which is not promising!), but here's its URL:

Anyway. Each of the people credited on the timeline got a link. The uses they put these links to varied, um, considerably, but David Wiseman's plonks you right into the part of his website you're likely to care about if you're reading the timeline in the first place. Sigh, no, I guess it used to. Anyway, though:

Now, here's the reason you *have to care*. There are a *lot* of problems with relying on Google for serious historical work, and this is especially true for the early 1980s, which appear to be the period of concern to you. The reason it's especially true for that period is that the various things Google did to headers to make them work with its database are most violent for that period. (Except for the period where its archive comes from DejaNews, which *really* violated headers, but anyway.) It had to mess with Message-IDs, it had various problems that led, for example, to one of the most historically important posts in the whole schmear being buried inside a bad joke post, and above all, it made bad decisions about how to assign dates to posts. The dates presented by posts in "original format" are usually the dates they were posted with, but the dates Google *tells* you a post was posted are just seriously wrong. This has been discussed any number of times on Usenet; my most substantial report on the issue was years ago, and I hope people have figured out more since, but you'll find it if you follow one of the search strategies I recommend later on.

Anyway, the point is that of the archives Google used, you can probably build your own for the entire period up until DejaNews.
No joke. But I don't know what it would take to get Kent Landfield and Jurgen Christoffel to help you out, and I *do* know that David Wiseman has been willing to help people out. So that's why you have to go to his website. You have to read what he has to say about research access to the Toronto archive, and then you have to do that.

*ALL* of that said. If you do what I'm recommending here, you'll find this out on your own, but if I'm misunderstanding how important this project is, maybe this will be too much work, so the short answer is that the Toronto archives started out incomplete and got progressively more so.

For example, a group which many people incorrectly claim was one of the first three groups, net.sources, doesn't have any archived posts until *months* after the archive becomes comprehensive at May 11, 1981. Henry Spencer didn't actually *intend* to archive net.sources at all, I think; or maybe he did but Google didn't; regardless, the archival of sources groups is spotty throughout their history. Ditto binaries groups, if you care (they're a much later invention, so if you're sticking with your chosen period, you may not).

Anyway, when I said way up there that the late 1980s would be harder to research than the early 1980s, what I meant was that Toronto got pickier about what it kept as Usenet expanded. The Toronto archives were built for a computer science department. So many of the newsgroups where people engaged in relatively frivolous discussion are massively incomplete. Unfortunately, this includes some of the important *social* milestones of Usenet's history. For an externally famous example, the Usenet-organised protests that led the Coca-Cola Company to phase out its "New Coke" in something like 1984 cannot be traced in full.
For some internally famous examples, there is strong reason to think we're missing much of the stuff behind the creation of the first FAQ, and don't get me started about how much we're missing when it comes to the creation of newsgroups. I'm bringing these latter examples up in order to emphasise that a vaguely computer-related topic is *not* necessarily going to have protected a newsgroup from Toronto's selectiveness.

My conclusion is that if you try to produce statistical data on the basis of the Toronto archive, either the original or Google's version, you are going to end up with amazingly bad data. I do have advice for you below about ways to deal with this, but for now I'm mainly pointing out that this will be true more or less unpredictably. At best, you may be able to work out which exact newsgroups Toronto was keeping religiously and study those. (At some point in the 1980s you may start seeing Toronto numbering the posts by newsgroup, which would help enormously with this, but I don't remember the history of the Xref header, so am not sure.)

No, Dave Wiseman is the guy he needs most. Also, Bruce Jones left UCSD years ago. I don't know (and don't have time to check in today's web surfing for this post) how much, if any, of his archives there are still in place; the A News archive disappeared about the time he left, but the extremely important UsenetHist archive may still be there (it was in early 2002, anyway). I wrote to him when I started the series of posts mentioned below; his reply was basically to tell me how amusing he found it to revisit the newsgroup I was posting in, news.groups. He may help someone in school more, I dunno, but don't get your hopes up too high.

I have since revised 1986, and covered 1987-1993, along with a draft version of the 1994 post.
Unfortunately, I have *not* corrected some egregious errors in the 1980-81 post. To a limited extent these are compensated for by a subsequent post I made, but only to that limited extent.

Um. Not that this tells the OP anything, so here goes some explanation.

When the archive was announced, I was thrilled. I believe I'm *still* the last on-topic poster to news.announce.important, and this was the spur to that. Anyway, I quickly started trying to get a handle on the history of newsgroup creation, given that I spent much of my time from late 1995 to mid-2004 working on newsgroup creation on the group news.groups (and sometimes elsewhere). So I started with a (now badly outdated) account of the history of newsgroup-listing, and then launched into a massive research project.

Essentially, the first phase of this project, and the only phase so far begun, is an attempt to get everything I reasonably can out of lists of newsgroups as sources for the history of newsgroup creation in particular hierarchies, these being: NET.*, fa.*, net.*, and mod.* (1980-1987) and talk.*, misc.*, rec.*, soc.*, news.*, sci.*, comp.*, and humanities.* (from 1986 on). In other words, I've basically been treating these lists as a data source, and both describing their peculiarities and mining them for data.

My main products in doing this research are two series of posts. One is "year-summaries", and these are what now cover 1980 through 1994 (draft only). The other is "hierarchy-summaries", and these exist only for the first four hierarchies listed above. Lucky you, if you're really interested in Usenet's early years, they're what you care about.

Regardless, I have no reason to think that my specific interest in newsgroup creation is in fact yours. You may find my work useful for establishing when various newsgroups existed.
However, be warned that official recognition in lists is what I'm studying, and that can at times be *radically* different from real existence: lots of groups existed without being listed, and some were listed without being real groups. You are much better off mining the archives directly (for this purpose, Google can be helpful) rather than relying on the lists.

So what you really need to do is mine the spin-off and the unrelated work. I'll get to those in a moment, but first of all, you need to know that Google, in its current incarnation, is mangling Message-IDs in the *body of posts*. As a result, you will not be able to find the posts I *refer to* in *my* posts. Which severely limits their usefulness as sources! Luckily, my posts in these series are all archived at my website:

Be sure to read also the opening paragraphs of my front page:

The post that corrects some (but not all) of the 1980-81 post is

I do try to notify various places when my websites move, but I'm not always good about it. The best way to find my site at any given time, if the link you're following doesn't work, is to find a recent post by me and look at my .sig.

Anyway. On to the spin-off work and the unrelated work.

I don't know how to find unrelated work. In my posts, I've cited what I found when I started out, but in general otherwise I've confined myself to noting what I find with my various search strategies. This means that if someone is, for example, doing a project that entails posting a list of obscure newsgroups, I'm likely to find it; but if they're studying social dynamics in net.lang.c, I'm not. My point is to insist that you'll often find secondary work valuable.

You will not be able to construct useful measurements of traffic in the 1980s from the Toronto archive. But people were *posting* traffic analyses *at the time*, and you will be able to use those, though for really careful scholarship, you'll need to find out what you can about how reliable they are.
(How good was propagation to the sites reporting their traffic? Which groups did they carry? Etc.) In particular, if it turns out that your period overlaps the beginning of something called "Arbitron", you've hit pay dirt; you should really really use Arbitron, and you should study carefully all the info you can find about its flaws.

Anyway, spin-off. Google's announcement produced a surge of posts to alt.fan.dejanews that you might find useful materials in; in particular, there was a sort of FAQ on how Google's URLs were built, though unfortunately that seems now to be outdated.

Posts in news.groups are the other area I'm familiar with, and these were spin-offs as much of my work as of Google's announcement. Most of these, and certainly most of the ones done late enough that their information is likely to be even approximately reliable, will have the tag History in the subject line; use this for searches. I think you will find much of this material valuable, very possibly more valuable than the year-summaries and hierarchy-summaries; I recommend in particular the thread "Early Mailing Lists", both because it's got a bunch of additional references in it (though here, thanks to Google's mangling of Message-IDs, some of these will now be hard to follow), and because it touches in some detail on the shortcomings of other archives of 1980s online material besides Toronto's.

You should also, if you want to be really thorough, research the issue of what happened before May 11, 1981. The correction post I pointed you to above is my most recent full statement about this, and it's not terribly clear; if you want me to try again, ask.
Definitely don't take what the 1980-81 post at my website says as gospel.

Anyway, actual surviving pre-1981 materials are discussed in a succession of posts on news.groups and alt.fan.dejanews, essentially the same posts that talk about the flaws in Google's presentation dates. (Come to think of it, I don't actually know that Google's dates are still flawed. Hmmm. Something else to look into.) One of my posts, available at my website, includes an archive of 1980-81 posts preserved outside Toronto (the only such archive known to me); I also have at my website a copy of a version of A News. You will find some, but I think not all, of the relevant material, when you search news.groups for the subject tag History.

Good luck. Please feel free to write me if you want additional help from me, but note that I do *not* have home net access at present. That's supposed to change, in fact to change this week, but it's not guaranteed to; and if it doesn't change, I'll only be checking e-mail between 2 and 4 times per week, on average. Also, be sure to use a very clear subject line (ideally, write in the form of a reply to this post) because I get a lot of spam and weed mostly on the basis of subject line.

Joe Bernstein