PKGW

DASCH Data are Now Queryable In VO Tools

Thu, 02 May 2024 14:00:46 -0400

Earlier this week, Markus Demleitner</a> let me know that he launched a project whose development we’ve been discussing for the past six months or so: DASCH</a> data are now available in the Virtual Observatory</a>!

If you’re familiar with DASCH’s legacy “Cannon” data services</a>, you might be asking: “hasn’t DASCH had VO support for a while?” Yes, DASCH has had some limited support: basically, you could download certain data tables in the VOTable</a> format as well as more basic plain-ASCII formats. If you manually navigate to the DASCH website, do a query, and set the right options, you can download a table in a VO format and then load it into a tool that recognizes that format, like TOPCAT</a>.

You could call this “VO support” in a certain sense, but it’s a far cry from the vision that motivates the VO effort. In the current words of the IVOA About page</a>, “The goal of the Virtual Observatory is to achieve the [feeling that astronomical data are] all available to explore in a single transparent system.” The 2001 US VO whitepaper, “Toward a National Virtual Observatory”</a>, descibes a data access layer that “provides a uniform interface to all data and services within the NVO” (emphasis original). My read of the original VO documents is that the vision here was quite literal — there would be one VO portal where you would go to get “the data”, for any astronomical data in the world.

That hasn’t quite happened, but we do have an ecosystem of services and tools that use VO standards for data discovery, and DASCH has not been plugged into that ecosystem at all. Until now.

Markus has taken an export of the DASCH plate database and made it available</a> as an ObsCore</a> table, allowing the list of DASCH images to be queried using standard tools, and a SIAP2</a> service, providing a mechanism for people to actually download the images described in the ObsCore table. (That service in turn builds on the new StarGlass</a> backend API.) His blog post on the GAVO site</a> very nicely demonstrates how these building blocks can be integrated into existing VO tools to suddenly make DASCH data available and discoverable in ways that simply weren’t possible before.

I’m excited for this service to become available, and hopeful that it will help make DASCH data more widely available to astronomers. That being said, I also expect that we’ll continue developing non-VO data access APIs based on the StarGlass platform, and hopefully customized support in astroquery</a>. In my experience, any non-trivial dataset always has facets that aren’t captured well by cross-cutting standards like ObsCore, and you find that power users of such a dataset virtually always benefit from a query interface that’s really specifically targeted at it. That’s especially true for DASCH, whose data are not only quite unusual, but also potentially interesting to people coming from all sorts of disciplines.

We’ve already seen a first example of how one-size-fits-all can fall down: Markus has gotten some complaints because Aladin</a> positional searches for images started yielding numerous hits for DASCH “meteor” plates, which are shallow, low-resolution images covering absolutely huge areas of the sky — dozens of degrees on a side, if I recall correctly. Not only do such large images cause problems for the Aladin UI, which was probably designed for images more like arcminutes on a side, but they’re also unlikely to be useful scientifically for most users either. Some of these plates were made with “telescopes” that consisted of literally three-inch camera lenses, and they have horrible vignetting, in addition to their terrible depth, resolution, and cosmetic defects. It may be true that if you pick any point in the northern-hemisphere sky, you’ll find that hundreds of DASCH meteor plates overlap it; but there are probably only a handful of people in the world who will ever want to look at them. We’ll probably restrict the default DASCH VO database to only include the narrow-field plates, which are more likely to be useful.

I think this issue nicely demonstrates the kind of challenges that arise when you take on a mission like that of the VO. The meteor plates can be perfectly well-described in the ObsCore format, and the query results being returned by Aladin are presumably perfectly correct — but it turns out when someone says that they want to see “all available data”, they really mean “all available data, but, like, obviously not including that”. Note that while there are technical elements to this problem, it’s not purely technical — it’s got design elements and potentially political ones as well. (What if I angrily insisted to Markus that every DASCH plate should turn up in every image cone search that intersects it, even if that means swamping other results in hundreds of meteor plates?) And the more you optimize for the common “well obviously I didn’t want that” case, the more you run the risk of losing out on the serendipitous discovery that many people clearly hope that the VO will promote. It’s a tricky balancing act!

A Modest Proposal for How to Do Digital Conference “Posters”

Thu, 25 Apr 2024 12:02:13 -0400

Apropos of nothing in particular (really), here is how conference organizers should handle “posters” when they need to be digital, either because the conference has a virtual component or there’s no standard poster hall. 
The Proposal 🔗</a></h1>

The digital equivalent of a traditional conference poster shall be called a “six-pack”.</li>
It shall consist of a slide deck containing no more than six slides.</li>
There are no other rules.</li> </ol>
Discussion 🔗</a></h1>
The “conference poster” form factor for scientific communication is not a good one. How many posters have you seen that are hard to navigate at a high level and impossible to read in detail? Some of this could be called “operator error”: certain posters are designed with a lot more care than others. But I generally find that even the best-designed posters are a bit dissatisfying, and I think is indicative of some basic difficulties imposed by the form factor. Posters are too large; printed text almost always comes in pages about the size of your face because that’s what matches the human field-of-view at normal reading distances.
This is not to imply that the time-honored in-person poster session isn’t useful. And if you’re going to have a traditional in-person poster session, the posters are definitely important — there’s a reason we don’t have people just standing around in assigned spots with nothing to point at. But on the other hand, if we’re seeking to extend the poster-session experience into the digital realm, there’s no particular reason to cling onto the poster form factor</a>. Skeumorphic designs for digital artifacts are bad</a>, usually.
So, what are the functions being fulfilled by a traditional printed poster? Loosely speaking, it’s tangible: a visitor can benefit from it even when you’re not physically standing next to it. And it’s accessible, in the intellectual sense: it should help someone quickly get the gist of the work you’re presenting, even if they’re not especially familiar with its context.
I’ll assert without proof that in the digital-native realm, when one wants to fulfill these functions, there’s one form factor that clearly reigns supreme: the slide deck. Sometimes I feel like this fact ought to bother me, but it doesn’t, really, nor can I figure out why it should. There are certainly times when people give the PowerPoint treatment to a complex topic that really should be worked out carefully, but that’s not PowerPoint’s fault; and there are surely just as many cases where complex topics deserve accessible, high-level summaries! A poster session is exactly one such case.
(Drive-by hot take: literate programming</a> as implemented by Knuth is basically a way to PowerPoint-ify source code.)
This sets the stage for the six-pack idea. A slide deck is the right, digital-native form factor for delivering an accessible, tangible summary of a research project. But, as anyone who’s been to a large conference knows, some people just can’t help themselves: tell them to prepare a poster, and they’ll give you wall-to-wall text in ten-point font. You can’t completely prevent this, but setting a limit on the number of slides is an easy and extremely concrete way to enforce some level of compactness.
Why six slides? Partially because it seems like about the right number, but mainly because then you can call the slide deck a “six-pack”, which is (hopefully?) catchy.
You could try to also set rules about font sizes or slide sizes (some people use PowerPoint to prepare full-size posters — it can work well as a page-layout tool), but it doesn’t seem worthwhile. If someone wants to go overboard, they’ll find a way, and more and more I feel that it’s important to lean into the ephemerality of posters or six-packs: they simply aren’t important enough to be worth the effort to rigorously prevent that. Likewise, I don’t see a need to try to set down rules about ”the first slide shall be a title slide,” or whatever — most people will do something sensible, and in this case it’s better to have a simple set of official rules than to try to legislate common sense.
What I would try to legislate, if I were running a conference that involved “poster sessions” with six-packs, would be the choice of file formats. I’m fairly optimistic that we’ll collectively have the capability of viewing PowerPoint and Keynote files from now until the end of time, but not everyone “attending” a digital conference is going to be able to open them. Six-packs should be provided in PDF format for long-term archiving, and either PDF or HTML for cross-platform viewing. I love HTML slide decks</a>, and I think they’re a perfect match for a “virtual poster session” scenario, but I don’t think it’s realistic to expect most people to be able to author them, just yet.

A Tool and Workflow for Radio Astronomical “Peeling”

Wed, 17 Apr 2024 15:20:26 -0400

With last Monday’s North American total solar eclipse throwing a lot of things off (ask me about my nine-hour drive from Vermont to Boston!), this is a good week for catching up on the blog backlog. So, nearly five years after I published it, here’s a quick advertisement for a radio interferometric peeling tool (code</a>, publication</a>) that I’ve developed. This post will describe the associated workflow in a bit more detail than could fit in the length-constrained Research Note (Williams et al., 2019</a>) that presented the tool. 
Fair warning: I’m going to get right into the weeds since this is a specialist topic.
When you’re doing interferometric imaging with a telescope like the VLA</a>, very bright sources are often a problem. In interferometery, calibration errors generally scatter flux from true sources across your image, potentially corrupting the measurements of whatever features you’re interested in. People interested in interferometric techniques therefore often spend a lot of time worrying about the dynamic range that they’re achieving, measured as the ratio of the brightest part of the image to some kind of noise metric (e.g., Braun 2013</a>). Standard calibration techniques should easily achieve a dynamic range much better than 100:1 (e.g., 1% scattering), and if that’s good enough, fine. But if you have a bright source in your field, you might need to use more sophisticated calibration to reach for 10000:1 or even better, because 1% of a big number might be enough to cause problems for your science.
Standard interferometric calibration techniques are “direction-independent”: they compute various parameters that are, in a certain sense, based on the total flux arriving at the telescope from the field that it's looking at, without paying attention to where exactly the flux is coming from. To get more sophisticated, often one adopts “direction-dependent” calibrations: your antennas aren’t perfect, and so the telescope’s response to flux from one part of the sky isn’t the same as its response to the same amount of flux from another part of the sky. Accounting for this reality requires a more complex instrument model and extra computation, but can yield significantly better results.
In the simplest situation, after solving for a direction-independent (DI) calibration, you might discover that a nuisance source dominates your image. If it’s bright enough and sufficiently easy to model, you should be able to self-calibrate to obtain a direction-dependent (DD) calibration optimized for that one particular source. If that source is off in the sidelobes of your antennas, those calibration parameters might be very different than the DI calibration parameters, which are generally derived by pointing your telescope right at a calibrator.
In this scenario, the emission from your science target(s) is still best calibrated using the DI parameters. So, you basically want to analyze the data using two different calibrations at the same time — but how? Some radio imaging tools can do this, but if we’re not using one of them, we’re not necessarily out of luck. In the “peeling” technique (e.g. Noordam 2004</a>), we simply subtract the source away in the visibility domain. To the extent that our system is linear, and our calibrations are invertible, and that we can model our source (hopefully all pretty good assumptions), this will get rid of the source and allow us to proceed with standard DI analysis as if it was never even there.
For the Allers+ 2020</a> windspeeds paper, this was what I wanted to do. But as far as I was able to tell (and to my knowledge, this is still true), the standard CASA</a> analysis package didn’t provide all of the tools necessary to peel. Most of the elements were there, but you need some mechanism to, essentially, invert a set of calibration gains, and I couldn't see any way to do that with the out-of-the-box tasks. Meanwhile, I couldn't find any third-party peeling tools that looked like they could plug into an otherwise fairly vanilla VLA cm-wavelength analysis.
But the underlying computation isn’t that complex, and it can be implemented if you’re willing and able to write some code to edit your Measurement Sets</a> directly. You could do it in Python, but I had been experimenting with building an interface between the CASA data formats and the Rust language, a project called rubbl</a>. For the large, complex data sets that you get in radio interferometry, Rust is worth it — I’ve implemented data-processing steps that become 10 times faster after porting from Python to Rust, and that’s not even factoring in the way that you can implement parallel algorithms in Rust in a way that Python just can’t handle. (If you’re using python-casacore</a> naively, at least; systems like dask-ms</a> might improve things.)
So I ended up implementing the missing step inside a toolkit called rubbl-rxpackage</a>, in a command-line tool called rubbl rxpackage peel</code>. (There are a couple of other mini-tasks inside rubbl-rxpackage</code>, but not many.)
The rxpackage peel</code> tool implements only the very specific calibration-inversion step that I found to be missing from mainline CASA. To actually implement peeling in a pipeline, you need to include the tool within a broader set of steps in your workflow. Theassociated Research Note</a> describes this workflow, but length limitations forced it to be quite terse. Below I reproduce the description in the note in an expanded format.
The Peeling Workflow🔗</a></h2> Assume that we have two Measurement Sets, main.ms</code> and work.ms</code>. The fundamental operation implemented by the peel</code> command is to perform the following update: main.ms[MODEL_DATA] += work.ms[MODEL_DATA] * (work.ms[DATA] / work.ms[CORRECTED_DATA]) </code></pre> Basic calibrations are multiplicative: CORRECTED_DATA = (calibration) * DATA</code>. So the ratio DATA / CORRECTED_DATA</code> is the inverse calibration term that we need, and the effect of the command is to update a MS model with another model perturbed by an inverted calibration. By computing the inversion this way (as opposed to, say, futzing with gain calibration tables directly, which was the first approach I considered), we get the nice property that we can invert any multiplicative calibration, without worrying about how exactly it was derived. With this building block, we can peel. We’ll assume that we have n > 0 bright nuisance sources to peel, numbered sequentially with an index i by decreasing total flux. The steps are as follows. Image main.ms[DATA]</code>, obtaining a CLEAN component image main.model</code>. (Or multiple model images, if you’re using MFS with Taylor terms, etc.). So, we’re assuming that you’ve used a task like split</code> to apply your DI calibration to your dataset. In the abstract this isn’t necessary, but CASA’s triple-data-column paradigm kind of forces you into this.</li> For each bright source to peel ... Create a CLEAN component image template.$i.model</code>, where $i</code> is the index number of the bright source in question. In this component image, zero out the sources with index numbers j ≤ i. That is, edit the actual component model image data to replace the pixel values around the bright source and the already-peeled ones with zeros. This requires some custom Python code, but it's straightforward and not I/O-intensive.</li> Use the task ft</code> to fill main.ms[MODEL_DATA]</code> with the Fourier transform of template.$i.model</code>. This set of model visibilities therefore captures the non-nuisance sources, and any bright sources that we haven’t yet started dealing with.</li> For each previously peeled source with index j < i, use the peeling tool with its work MS (see blow). This will update main.ms[MODEL_DATA]</code> to add in the best-available DD-calibrated models for these sources. Once this is done, the model will capture everything in the image except the ith nuisance source, because we zeroed it out when creating template.$i.model</code>. We also zeroed out the j < i sources, but now we’ve added them back in.</li> Clear main.ms[CORRECTED_DATA]</code> and use the task uvsub</code>, which will set CORRECTED_DATA = MODEL_DATA - DATA</code>. This will leave the CORRECTED_DATA</code> column containing only source i — the things that are not in the model are the ones that remain after we subtract the model. Of course, this only holds to the extent that our calibrations and models are accurate, but if we’re focusing on mitigating bright nuisance sources, the imperfections shouldn’t be significant, by definition.</li> Use the task split</code> to create a new dataset work.$i.ms</code>. Its DATA</code> column will be equal to this CORRECTED_DATA</code> column, so it will contain only the signal from source i. This signal has had the DI calibration applied, but still could benefit from additional DD calibration.</li> Fill work.$i.ms[MODEL_DATA]</code> with a model of source i, using whatever standard CASA tools are appropriate. E.g., you might use the “component list” functionality and the ft</code> task.</li> Use CASA’s standard calibration routines to determine a source-specific self-calibration for source i. You can use whatever calibrations are appropriate, so long as their net result is multiplicative on a per-visibility basis. You don’t have to worry about absolute flux calibration since we only care about removing the source in the “uncalibrated frame”. We now have the DD calibration for our source, since we subtracted all of the flux for everything that is not our source.</li> Use task applycal</code> to, well, apply the calibrations, filling in work.$i.ms[CORRECTED_DATA]</code>. The only reason we need to do this is so that the peeling tool can then figure out the inverse calibration by computing the ratio DATA / CORRECTED_DATA</code>. If there were a way to directly invert the calibration parameters obtained in the previous step, we could skip this step, but I would only feel confident doing so with unrealistically trivial calibrations.</li> </ol> </li> After doing the above, we have i copies of our dataset, stored in the work.$i.ms</code>, each of which encodes the DD calibration solution associated with its corresponding bright nuisance source. Clear MODEL_DATA</code> and CORRECTED_DATA</code> in main.ms</code>.</li> Now, for each bright source, use the peeling tool. This will build up the MODEL_DATA</code> column of main.ms</code> to contain only the DD-calibrated models of the bright nuisance sources. Note that, if the calibrations are multiplicative as required, the actual data “disappear” in the ratio of DATA / CORRECTED_DATA</code>, so there is no danger of faint signals making their way into this model (which would be bad because they’re about to be subtracted from the science data). Put another way, this column will be noise-free. It contains a sum of ratios of analytically-derived quantities: source models (the MODEL_DATA</code> in the work.$i.ms</code>) divided by instrumental (calibration) models (more precisely, multiplied by the reciprocal DATA / CORRECTED_DATA</code>).</li> Finally, we can use uvsub</code> to subtract this model from the science data. main.ms[CORRECTED_DATA]</code> will now contain the science data with the bright sources peeled. You can then proceed to image, selfcal, etc.</li> </ol> Does it work? It sure does! The results were actually better than I had dared hope. Demonstration of the workflow with a 10 hr VLA observation. The left panel shows the image before peeling, the right after. Color scales are the same. The circles indicate the half-power response of the VLA primary beam at the central observing frequency, 6 GHz. The CLEAN residual rms at the pointing center decreases from 3.9 to 2.5 μJy. From Williams et al., 2019.</figcaption> </figure> </div> I had probably been wishing for this functionality for, say, five years, before I sat down and implemented it. Rust helped, but really the breakthrough was realizing that I could invert the calibration gains by computing DATA / CORRECTED_DATA</code>. Before that (in retrospect, obvious) idea came to me, I was getting hung up on how to invert a CASA gains table — you can take the reciprocals of all of the numbers in the table, but how does that interact with time-based interpolation? What if you also want to apply a bandpass correction? It just seemed like it was going to be really dodgy. The “empirical” approach isn’t as efficient, but it’s robust, and that’s a tradeoff I’m generally happy to make.
The xz Backdoor and Release Automation Thu, 04 Apr 2024 10:37:08 -0400 Some of you may have spent a lot of the past week following the drama ensuing from the discovery of a backdoor inserted into the open-source xz</code> library</a>. Others of you probably have no idea what I’m talking about. There’s been more than enough chin-stroking on this topic over the past week, but I want to at least point out a connection to a topic near and dear to my software-engineering heart: release automation. The short version of the xz</code> story is that, more or less by luck, an engineer discovered that someone had subtly inserted a malicious backdoor</a> into a low-level compression library called xz</code>. The backdoor was only introduced a few months ago, so it didn’t really have time to spread much around the internet — but it was designed in a way such that if it had, it could plausibly have given the perpetrator a nearly-invisible way to hack into huge numbers of computers all across the internet. Yikes! The scary thing is that the “someone” who inserted the backdoor was the nominal maintainer of the package — the person in charge of it. This “person”, operating under the name Jia Tan, almost certainly doesn't actually exist. And they weren’t the original maintainer of the package. With hindsight, it appears that xz</code> was subject to a multi-year, state-sponsored social engineering attack. The original, non-nefarious developer posted in 2021 about being too burned out to maintain the library; almost immediately, “Jia Tan” appeared and started helping out. Eventually “Jia Tan” was given maintainership rights, and finally, after a few years, attempted to sneak in their malicious code. I’m no security expert, but I agree with most of the armchair folks out there that between the long-time horizon of the attack and the complexity of its implementation, it seems overwhelmingly likely that it was supported by an organization of national scale. As a side note, you might recall that I posted about seeking new Tectonic co-maintainers</a> and eventually added two people</a>. I remain very happy that I did! They were both previous contributors and during that process I got some information about their offline identities; I can’t say that there’s zero chance that I’m not the subject of a sophisticated intelligence op, but I’m not losing sleep over it and I don’t think that you should either. (Also, hindsight further suggests that first step of the social-engineering campaign against the original xz</code> maintainer was targeted bullying about his failure to be sufficiently responsive to feature requests; although it would be hard to distinguish this from everyday life as an open-source maintainer! Either way, that’s thankfully not something that I’ve experienced with Tectonic.) Anyway. The discovery of this attack has prompted a lot of musing about the open-source ecosystem; this toot</a> from Mikey Dickerson (or is it “Mikey Dickerson”?) put it well: hey does anybody out there have any thoughts about the xz compromise or perhaps have you thought of a way to relate it to some axe you have been grinding for 20 years </blockquote> I sure do, and I sure have! But here I’m going to keep it to one specific axe that’s only a few years old. For something like the xz</code> attack to succeed, the malicous code has to be well-hidden. People use libraries without reading their source code all of the time, but if your repository contains a big function called trojan_horse_implementation</code>, someone is eventually going to look into it. You also only want to enable the code in narrow circumstances, so that routine security tests don’t even see that it’s there. A lot of the sophistication in the xz</code> attack is how the malicious payload was obfuscated and disabled altogether when it wasn’t needed. One component of this was that the code that developers download and install from GitHub is unproblematic; a single, innocuous-looking line inserted into one of the build scripts in the packaged release files starts up the whole machinery that activates the hack. This has gotten some people worried, and rightly so. Generally, developers use the GitHub version of a piece of software, but most deployed versions (i.e., the stuff that ends up installed on thousands or millions of computers) are based on release packages created by the maintainer. The release package is supposed to capture what’s on GitHub, but usually the maintainer creates it by hand, so really it could contain … anything. If I offer a compiled version of a program, for instance, I could compile in whatever extra code that I want, and the nature of the packaging and compilation process makes it effectively impossible to verify that the “release artifacts” truly correspond to the public version of the source code found on GitHub. This system requires you to completely trust the maintainer who creates the releases — which is now feeling like a scarier proposition than it used to! How can we mitigate this? Well, I claim if we want to be able to verify that release artifacts truly correspond to their alleged source code, we need to automate the processes by which they are produced. And would you know that I’ve been excited about release automation for several years now</a>? Once you ask the question “why are maintainers creating these things by hand, anyway?”, I feel that the proper solution becomes self-evident. That being said … why are maintainers creating these things by hand? In my view, a big reason is simply inertia. The counter-question — “what’s the alternative to making a release by hand?” — only has an answer thanks to the existence of publicly-available, cloud-based CI/CD</a> (“continuous integration and deployment”) services, and I think that a lot of projects still really haven’t internalized the kinds of workflows that these services unlock. It’s been interesting to watch the evolution in this space. When I started using Travis CI, it was basically a way to trigger a VM to run test suites for various programming languages. But at some point we collectively realized that if you can do that, you can really run any kind of code — these services are really free, easy-to-configure platforms for cloud compute on demand, that just happen to have their workflows tied to Git repositories. (I have no idea how these services prevent people from abusing them to mine crypto; I’d guess that they have whole teams dedicated to stopping just that.) Conda-forge</a> was quite ahead of its time in realizing that you could use CI/CD to build and publish software packages, but that sort of insight hasn’t sunk in everywhere. The other piece is that there are some genuine workflow problems that need to be solved in order for maintainers to create high-quality release artifacts on these public CI/CD systems. Admittedly, lot of maintainers have been perfectly happy with the status quo, but I feel that there are some subtle issues at the root of this activity that need to be addressed carefully. This problem bothered me for years before I first sketched out a solution that felt adequate</a>, and then I had to create a whole new tool</a> and workflow</a> to implement it. While the ideas behind Cranko</a> are, I think, quite general, it’s also true that its approach also benefits greatly from the sophistication of modern CI/CD systems; its release automation would be a lot more annoying without Azure Pipelines’</a> tools for accumulating artifacts and managing multi-stage builds across the Linux, Windows, and Mac platforms. That is to say, there are a lot of pieces of infrastructure that need to come together to make high-quality release automation feasible. Fortunately, those pieces currently exist. Release automation on public CI/CD is far from an airtight solution, of course — a malicious maintainer can insert obfuscated build steps, use hidden environmental settings, or simply replace automatically-generated release artifacts with tampered ones after the fact. But it at least enhances our ability to audit release artifacts and understand how they’re produced. I didn’t have the supply-chain security angle in mind when I developed Cranko</a>, but I wouldn‘t be surprised if people start adding release automation to the list of security-enhancing practices that they want to see open-source projects adopt. DASCH Scanning is Complete — What’s Next? Thu, 28 Mar 2024 02:00:00 -0400 Today’s the day! After two decades of work, the DASCH scanning “prime mission” is now complete. Today we celebrated the scanning of the final DASCH-able plates in the HCO collection with a small event and a champagne toast. This is the end of a chapter, but far from the end of the story. Just how much work is left to do is a question whose answer depends on how broad of a perspective you want to take. When it comes to making the existing DASCH data available and useful to the astronomical research community, there’s still plenty to be done. In the near future, that work will mostly take the form of the next data release, DR7, currently under development as DRnext</a>. As my recent posts have perhaps conveyed, this is going to be a lot of effort — there’s basically an unbounded amount of work that could go into writing documentation, fleshing out software like daschlab, and other support activities — let alone actual improvement of the underlying data products. But DR7 also needs to come out in a finite amount of time. I can already tell that it’s going to be difficult to draw the line at which the polishing has to stop and the thing simply has to get out the damn door. Regardless of where exactly that line gets drawn, there’s absolutely no chance that DR7 will completely plumb the depths of the DASCH data. You could spend the rest of your life improving and expanding the analysis of a dataset as large and heterogeneous as DASCH. Whether we’ll see Data Releases 8, 9, etc. off into the future is, as ever, going to depend on money. If you ask me, a dataset as rich and unique as DASCH certainly deserves lots of funding to analyze and enhance it — and that’s not even considering its cultural and historical implications — but lots of projects deserve funding. It’s also important to underline that “DASCH-able plates” represents only a portion of the Plate Stacks holdings. About 430,000 plates have been scanned for DASCH, compared to an estimated 550,000–600,000 plates in the entire collection. This gap exists because DASCH explicitly focused on plates suitable for astronomical photometric and time-domain analysis. Plates that did not fit this definition include solar, lunar, and eclipse observations, and above all, the spectrum plates, containing the hundreds of thousands of stellar spectra studied by Annie Jump Cannon. These plates might not be useful for time-domain photometry, but that is not at all to imply that they're not valuable in their own right. For instance, it’s believed that the spectrum plates could provide unique insight into climate change by measuring the gradual changes in the absorption lines that Earth’s atmosphere imprinted on the data. But wait, there’s more! The Plate Stacks holdings also include film, written materials, and other artifacts. We can consider DASCH to be one tentpole among many in the broader effort to digitize the holdings of the Plate Stacks. It was certainly the tallest tentpole, but others remain. Heading in a different direction, we can look beyond the Plate Stacks collection. The DASCH scanner</a> is still going strong, and I think it’s fair to say that in many ways it remains the highest-quality plate scanner in the world. It’s very tempting to consider digitizing other plate collections. This presents logistical challenges: the scanner is essentially immovable, but large collections of glass plates are difficult and expensive to transport as well. It’s also true that the scanner is a one-of-a-kind piece of hardware, analogous to a telescope instrument. Many of its components are virtually irreplaceable — and getting quite old. If you wanted me to promise that the system will keep working for the next decade, you’d have to give me a fairly significant sum of money to invest in documenting the existing setup, assessing potential failure modes, and developing recovery plans. Granting that baseline risk, the Plate Stacks are already in possession of a few external plate collections whose digitization with the DASCH scanner we hope to demonstrate. We have copies of POSS-I</a>, the ESO Southern Sky Survey (which appears to be surprisingly undocumented, but see Schuster in Messenger, 1980</a>), the SERC</a> Equatorial Red survey (SERC-ER) and associated AAO</a> Second-Epoch Survey (AAO-SES</a>, aka AAO-R), and the Palomar “Quick V” (Pal-QV, Lasker+ 1990</a>) survey done in support of the Hubble Guide Star Catalog project. Together these form a substantial fraction of the plates that went into building the Digitized Sky Survey</a>. I’d love to compare DASCH results to the DSS data! This could dovetail beautifully with my longstanding desire to upgrade the “DSS Terapixel” map that’s the default optical basemap for WorldWide Telescope</a>, which is generally great but has a few longstanding issues (most notably for scientists, a few-arcsecond global astrometric offset). What’s kind of amazing is that, as far as I’ve been able to figure, digitizations of these photographic surveys still, in the year 2024, represent the best way to obtain a deep, high-resolution optical map of every last nanosteradian of the sky. Modern surveys go awfully deep and awfully wide, but I’m not aware of a set of surveys that can be homogeneously combined across the entire sphere. To try to make concrete the sophistication of the DASCH system — not just the scanner, but the people and processes surrounding it — I’ll note that we could probably digitize all of POSS-I in a couple of good weeks. Finally, a call to action: these are things that I would like to see happen, but in truth probably none of them are going to occur (except DR7) without community support and collaboration. If any of these topics are at all interesting to you, please let me know! I would genuinely love to help other groups launch their own projects based on the DASCH data, the scanner, or other elements of the Plate Stacks collection. Beta-Testing DASCH “DRnext” Thu, 21 Mar 2024 11:06:05 -0400 The work on DASCH</a> continues to move forward! Yesterday, I posted a first draft of a new set of resources for astronomers. These are collected under the DASCH DRnext</a> moniker and are now ready to be checked out. The fact underlying the DRnext</a> designation is that while DASCH has historically had a series of “data releases” (DRs), they weren’t really releases in the usual sense. Normally a DR is associated with a specific set of immutable artifacts, so that if you choose to work with, say, SDSS DR12</a>, you’ll always get exactly the same results if you repeat the same queries. But DASCH only has one set of data servers, and they're always being updated as scanning proceeds and the data processing gets refined, so we’re unable to provide locked-in artifacts. Historically, the DASCH DRs were basically about lifting restrictions on public access to certain portions of the sky. It would be nice to be able to provide traditional, immutable DRs, but with the current resources and system architecture that’s not feasible. In the meantime, the “DRnext” label is the place where the latest-and-greatest DASCH docs and tools will accumulate — sort of like the main</code> branch of a repository as opposed to a versioned release. So, what’s just landed on main</code>? The centerpiece of DRnext is an effort to deliver better tools for scientific data analysis via daschlab, the Python package that I’ve mentioned here</a> a few times</a>. In support of this, we now have: The demo video I posted about earlier</a></li> Python API reference docs</a> (which I also mentioned earlier)</li> A tutorial slideshow</a> that lets you work through the notebook shown in the video, via a MyBinder notebook</a></li> The beginnings of a lightcurve reduction cookbook</a> based on daschlab (although there's no reason you couldn’t use the same techniques in a different data-analysis system)</li> Instructions for installing daschlab locally</a></li> </ul> Integrated with this are several other new resources: An Introduction to DASCH slideshow</a> aimed at astronomers</li> A cloud-powered sample “quicklook” notebook</a> aiming to provide an alternative to the Cannon web-based plotting tools</a></li> Reference documentation of the DASCH lightcurve table columns</a></li> Thorough cross-referencing to the newly-launched StarGlass</a> website and API where appropriate</li> Reorganizing the existing material to hopefully make it more manageable</li> </ul> If you look at the DRnext landing page</a>, I have very plainly used the Diátaxis (née Divio) documentation model</a> for organizing things. This is the first time that I’ve put together a landing page with an explicit Tutorial/How-To/Explainer/Reference breakdown; in this particular case, at least, I think it works well. (All of this might also help explain why I’ve been thinking about</a> digital documentation lately</a>, although that’s something I do pretty often regardless.) I’ve tried a couple of new things in assembling this documentation. One is the way that I’ve presented the “Introduction to DASCH” material</a> as a web-based slideshow; historically I would have written this kind of material up as a brief document. I find it a little painful to admit, but more and more I feel like slide decks are a good way to deliver this kind of information; the pagination breaks things up into digestible chunks in a way that’s easier to approach than even a relatively short single-column presentation. (Drive-by hot take: in practice if not stated intent, Knuth’s literate programming</a> is basically a way to PowerPoint-ify source code.) I would hate to only deliver this kind of slideshow in a properietary format unrecognized by the browser, like a PPTX file, but since I can deliver the slides seamlessly using reveal.js</a>, I’m much more comfortable making them the exclusive publication format for this information. Building on that, the daschlab RY Cnc tutorial</a> is delivered as yet another slideshow, one in which most of the slides contain video clips showing how the software is used interactively. The idea is that people can follow along the slides, and play and replay the clips, while using the notebook in a separate browser window. I think that video examples are super important for showing people how to use the highly interactive daschlab software; hopefully the slide-based presentation once again breaks things into nicely digestible chunks. It turned out to be relatively easy to record one long screencast of myself running through the notebook, then to use Kdenlive</a> to slice it up into short clips suitable for each slide. Finally, I’m hopeful that the “cloud quicklook notebook”</a> idea will provide a decent alternative to DASCH’s legacy web-based lightcurve plotting tool. While the legacy tool is not super featureful, can produce genuinely misleading output, and is nigh-unmaintainble, it is clearly really important to give people a way to peek at some data right in their web browser, and that’s something that’s generally tricky with Jupyter/code-oriented tools. I’m crossing my fingers that a lightweight notebook can meet that need, and hopefully the “quicklook” framing will help people see it in the right light. There are plenty of systems aiming to integrate this kind of notebook into web sites/apps in a much smoother way, but I don’t think that DASCH has the resources to pursue them right now. So for the time being the quicklook is based on MyBinder</a>, which is a great service, especially because it’s free … but it’s slow, and not always reliable. Even if we don’t add some slick Solara-type</a> integration where you don’t even realize that you’re running code, I think that the UX would feel a lot better if the notebook could reliably spin up in just five or ten seconds. One of the next things I want to do is look into setting up daschlab on a science platform service like SciServer</a>, or something that can hopefully offer ease-of-use comparable to MyBinder with better performance. Here at the CfA, there’s a vision of building a “Nexus” science platform (based on Rubin</a>). It’s still in the early stages, so it’s not something that I expect to be able to integrate any time soon, but it would be an ideal home for this kind of capability. Digital Documents have APIs Thu, 14 Mar 2024 10:25:41 -0400 Following up on last week’s viral sensation Digital Documents are Web Applications</a>, I want to add another idea into the mix. Here’s an additional thing to keep in mind about digital technical documents: yes, they’re incarnated as web applications. But unlike many web applications, they also expose nontrivial APIs — by which I mean that they expose interfaces aimed not just at humans (the user interfaces) but ones aimed at computers as well. In particular, the primary API exposed by a digital technical document is its cross-referencing structure. Consider the reference documentation for a Python library like Numpy</a>. It describes various symbols exposed by the package, like numpy.hstack()</code></a>. If I’m documenting my own Python library, I might want to reference that piece of documentation from my own text. Any reference like this implies a sort of contract — I’m expecting, or at least hoping, that the URL embedded in the link above will deliver you to documentation about the numpy.hstack()</code></a> function and that it will continue doing so into the future. You could quibble about the terminology, but I’m happy to refer to any sort of contract between two distinct digital systems as an API — and that’s what we have here. (The bare word “interface” might be better, except that I think it too easily conjures up the idea of a user interface, which is specifically not what I want to focus on.) My document depends on a thing that I can call “the Numpy docs”, and in particular it depends on the Numpy docs providing a thing called “the numpy.hstack()</code> docs”. The most fundamental API offered by any web document is its URL structure: what links into it are valid? Through this lens we can see linkrot</a> as a sort of API break, which feels right to me: you told me that I could rely on this thing, and now you’ve gone and taken it away. Please don’t do that! URLs, however, are semantically opaque in the aggregate. And I’d argue that one characteristic of technical documents is that they tend to expose multiple referencing structures that wish to be semantically rich. Think of a long scientific article. Discussing its contents, I might want to refer to Figure 1, Figure 2, Figure 3; or Table 1, Table 2, Table 3; or Equation 1; or Section 2; or Reference #3 (if its references are numbered); and so on. While each of these references can likely be resolved — flattened — into a specific URL in the case of a digital article, as an author I want to work at a higher level. But the phrase “want to” isn’t strong enough. In the technical sphere, the ability to make reliable, semantically-rich references among documents is essential functionality. People usually agree that scientists use TeX because of its ability to render equations, but I’ll claim that BibTeX</a> is just as important. The ability to write \citep{latest.astropy}</code> and get a properly formatted reference (both in-text and metadata in the References section at the bottom) is huge. I’m happy to go even farther and argue we can think of the institutional architecture of worldwide academic publishing is being largely designed to promote reference-ability. If you hand me a journal name, volume, and page number, I can probably locate the article that you’re talking about, even if it was published a century ago; contrast that with the durability of your typical web link. I’ve seen people talk about how impressed they are by how academics do such a good job of citing prior work, and I think that you can explain a lot of that as being because we have designed our entire profession to make this kind of citation robust and convenient. The framework of citation allows us to build a (mostly) coherent intellectual edifice out of individual workers’ labor. And we see the same phenomenon in software! As programming languages have evolved, they’ve tended to provide increasingly sophisticated and reliable systems for expressing and resolving dependencies between independent pieces of software: from downloading libraries from random websites, to CPAN</a> to PyPI</a> to NPM</a> and Cargo</a>. Just as in the case of scholarship, these affordances unlock scalability. While it’s hard to build a C project with more than a handful of external dependencies, Python projects easily have dozens, and a Rust project like Tectonic</a> has hundreds. The key distinction between software and scholarship is the semantic richness of the interfaces between items. Scholarly citations are famously about as empty as you can get: if you write a paper saying “Williams et al. (2020) is garbage” or “Williams et al. (2020) is brilliant”, it’s a citation either way. Meanwhile, the APIs that capture the relationship between software components are complex and getting moreso all the time: from function prototypes and header files in C to complex type systems with public/private visibility annotations in languages like Rust or Go. Technical documentation lags behind all of this. Within the framework of a specific programming language, cross-referencing is usually well-supported: tools like Intersphinx</a> enable this in Python, and rustdoc</a> has it built in. But if I want to reference a Python method from the documentation of a Rust project? Or if I want to link to a particular passage of an online manual in a way that will keep working when the manual’s authors inevitably change their URL structure? I’m on my own. For all of the reasons given above, I think that the lack of infrastructure in this area really limits our ability to create great digital technical documents. When it comes to declaring and resolving dependencies, we probably have the baseline of what we need — namely, URLs and the web. But right now we don’t have the tools to make explicit the “APIs” (cross-referencing structures) exposed by documents, and in my view, this lack makes it so that we don’t do a good job of rationalizing those APIs or monitoring compatibility. Further, it prevents us from building authoring tools that can span multiple technical silos. This feels to me like a very solvable problem. Sneak Peek: daschlab Thu, 07 Mar 2024 10:53:26 -0500 Recently</a> I mentioned that I’ve been working on a Python package called daschlab</a>, which will be the recommended analysis toolkit for DASCH</a> data. It’s designed for interactive data exploration, so I thought that I’d make a video giving a sense of what it’s like. Here it is! </iframe> </div> </div> I haven’t yet written up a lot of the needed documentation, but if you’re feeling adventurous you can play with it today. Besides local installation, this MyBinder link</a> will load up a JupyterLab environment with daschlab installed. Any data that you download won’t persist, but you should be able to try a few things out. There‘s also some API reference documentation</a>, but these docs are intentionally low-level; there no installation instructions, tutorials, etc. That kind of stuff will end up on the main DASCH website</a>. In the course of making this video I went through my quasi-annual revisitation of what it takes to do efficient desktop video capture on Linux; a lot of that stuff is extremely hardware- and distribution- specific, but if you want to see what a glutton for punishment I am, the gory details are documented on my Capture Video Efficiently on Linux</a> and Get a Standardized Browser Window</a> HOWTOs. Get a Standardized Browser Window Wed, 06 Mar 2024 12:08:06 -0500 To demonstrate an interactive webapp, it can be very helpful to record a screencast of its usage. If you just pop open your day-to-day web browser and record something, it might be fine, but there are two concerns. First is privacy/professionalism: if you type something in the URL bar, say, it will likely autocomplete items from your personal history. Second is repeatability: if you’re making a series of related videos, or updating an existing one, ideally the browser environment would change as little as possible; the window size in particular should stay the same. Here are some steps to create a repeatable browser environment using Firefox. Nothing here is magic; there are surely other approaches that would also work, but this is what I do. The key is that Firefox takes a command-line argument --profile <path></code> that lets you specify a custom input profile directory. The basic idea is to set up an extremely generic browser profile in a custom directory, then archive it. Every time you want to record a video, unpack a copy of the archive, use it, and then throw it away. This approach can of course be extended: you can create any number of standardized profiles, and they can be tweaked to save logins to relevant webservices, use project branding, etc. Creating a Standardized Browser Profile🔗</a></h1> In some kind of work directory, mkdir standard</code></li> Run firefox --profile standard</code></li> Skip all personalization steps.</li> Ctrl-Shift-B to hide the bookmark bar.</li> Remove superfluous icons/features from the address bar.</li> Right-click to “Customize Toolbar” and remove more stuff, and flexible spaces around the address bar.</li> Open up the Settings screen.</li> Set homepage and new tab page to Blank.</li> Disable all “Home” content.</li> In Search, set the default engine to Wikipedia.</li> Disable search suggestions and address bar suggestions.</li> Remove as many default search shortcuts as possible.</li> In Privacy & Security, turn off “Ask to save passwords”.</li> Turn off Autofill options.</li> Turn off popup blocking (seems like the better default for this use case?).</li> Follow the steps for standardizing the window size below.</li> Quit Firefox.</li> In a terminal, run sqlite3 standard/places.sqlite</code>. We’ll clear some tables. The list here comes from looking at the schema and checking which tables are non-empty by default. DELETE FROM moz_origins;</code></li> DELETE FROM moz_places;</code></li> DELETE FROM moz_historyvisits;</code></li> DELETE FROM moz_bookmarks;</code></li> DELETE FROM moz_keywords;</code></li> DELETE FROM moz_anno_attributes;</code></li> DELETE FROM moz_annos;</code></li> DELETE FROM moz_places_metadata;</code></li> </ol> </li> rm -rf standard/cache2</code></li> Tar up standard</code> and save the resulting archive somewhere.</li> </ol> Using the Standardized Profile🔗</a></h1> Unpack the archived standard</code> tree</li> Run firefox --profile standard</code></li> If in doubt, follow the steps for standardizing the window size below.</li> </ol> Standardizing the Window Size🔗</a></h1> Ctrl-Shift-I to open devtools</li> Pop the devtools out to their own window</li> F1 to enable devtool options</li> Activate “Toggle rulers for the page”</li> Click the new right-angle ruler icon in the devtools header</li> Resize the window so that the content area is 1280×875 px (see below)</li> Close devtools</li> </ol> The size here is chosen so that the final window with toolbars and decorations has dimensions of 1280×960, which is a 4:3 aspect ratio. You can figure out the padding by setting the browser content area to a reasonable size, then taking a screenshot of the browser window and looking at the size of the resulting image. However, when using X.org, which is necessary to efficiently capture video, the final window size includes a fair amount of transparent blank space surrounding the actual window — I believe this is used for drop shadows and such. I can configure OBS to crop away this border, and this is desirable to avoid black borders around the video edge. Also, if the window is on a HiDPI display, various pixel sizes are doubled — we'd be aiming for 2560×1920. Currently, on my laptop’s display, which is HiDPI, the X.org window border consists of 46 px at top (in HiDPI pixels), 52 px left and right, and 58 px at bottom. This means that the target window size we're actually going for, in terms of what a screenshot will yield, is 2664×2024. On an attached regular-DPI monitor, all of these measurements are halved. Compared to the devtools content-area size readout, the cropped window size has the same width, but is 85 non-HiDPI px taller, at the moment. Bonus: Setting up OBS🔗</a></h1> The primary use case for all this is making OBS</a> video captures. Here are some notes about the setup of that. See the video capture howto</a> for notes about proper video encoding settings. In theory that is all separate from the issues considered here.</li> Set up a “window capture” device to capture the browser window.</li> Set it up to crop as above: 46 / 52 / 58 / 52 px (in CSS ordering) if HiDPI; half those values if not. There is a "source screenshot" functionality that you can use to check the results. (Crop one pixel smaller on each axis and open in Gimp to verify that there is just 1px of border on all edges.)</li> Use the “Edit Transform” (Ctrl-E) box to check the size of the stream, post-crop.</li> Use "Resize output (source size)" to match the output to the window.</li> If you're going to overlay a webcap feed, how about putting it in the top-right, 24 px from the edges? And giving it a size of 412×232 px? (These measurements assuming HiDPI).</li> Finally, use ffprobe</code> to check the pixel dimensions of a test recording.</li> </ol> See Also🔗</a></h1> How to capture video efficiently on Linux</a></li> </ul> Digital Documents are Web Applications Fri, 01 Mar 2024 09:41:38 -0500 I spend a lot of time thinking about digital technical documents. A computer’s screen is a more capable medium than the printed page: can we use that capability to actually communicate more effectively? To date, the answer appears to be flatly “no”. To an incredible extent, scientists’ preferred digital format for technical material is the PDF file, a completely print-oriented format. I’m not saying that scientists are wrong to use PDFs so much. Indeed, the fact that they do (unlike almost everyone else!) tells us something profoundly important. But it seems like we should be able to do better. And, like it or not, the alternative is clear: when I read a news story, an essay, or a recipe, I’m reading it in a web browser. The latest irreverent sports blog from Defector</a> may not always feel like a “document”, but that’s a totally valid way to look at it. Through that lens, we can see it as a document targeting the web platform</a> as its “output format” rather than something like PDF. The web platform is a complete mess, but (because) it benefits from billions of dollars of annual investment. You can use it to play videos, 3D games, analyze data, and, yes, read. Maybe things will change one day, but for the foreseeable future, it’s an inescapable conclusion that if we want to create digital technical documents that anyone actually uses, we need to target the web. More recently, I’ve come round to thinking that we should go a bit farther and try to start thinking of digital documents as web applications. This idea runs into a bit of an early roadblock because I’m not sure how exactly I would draw the line between a web application and a web … not-application. But it feels like a real and meaningful distinction. On the not-application side, we can have things like traditional latex2html</a> output: static HTML, CSS, asset files, and little or no JavaScript. On the other end of the spectrum, we have Jupyter, Google Maps, and all the rest. My intuition says that somewhere in the middle there’s a phase transition. Practically, it’s probably the point at which you start building your content using the web dev stack (NPM, bundlers, etc.) rather than hand-coding your HTML/CSS/JS as a bunch of static files. In this schema, it’s intuitive to want to put the things that we think of as “documents” on the not-app (let’s say “static content”) end of the spectrum, and not without reason. My problem with this is that you cut yourself off from a whole world of possibilities if you think of your document as an HTML file with some CSS and JS sprinkled on top. Instead of asking yourself “I can do anything that the web platform permits — what should I do?”, you end up asking yourself, “What can I do that won’t be too hard?” Consider search. One of the key benefits of having a digital document is that you can do full-text searches instead of having to consult an index. It’s great! And the longer your document is, the more valuable a search feature will be. But if your document isn’t one giant webpage</a>, the browser’s find-in-page feature isn’t going to suffice. So it’s reasonable, perhaps even incumbent, to implement a search UI. How are we going to show results? Preview snippets would be nice. So would complex queries. And history. Do we want to hand-code this all ourselves in low-level HTML/CSS/JS? The mdBook</a> system used by a lot of Rust documentation does, and when I look over that code, I say to myself … yeah, this is why people adopt web development frameworks. The same considerations apply in other areas, like interactive figures and tables, navigation, embedded computation, or theming. I can imagine a lot of really cool features in these areas and I do not want to implement them all using the lowest level of tooling. Getting a bit more ambitious, we can also start envisioning how the structure of a document could become more dynamic. I was once at a conference session about Python documentation tools where people got into a big discussion about whether it was better to have each function documented on its own page, or to group chunks of related methods on one big page. Why do we have to pick one? If we think of our documentation as being a bunch of static files, anything else is a pain. But if we think of ourselves as building a web app, it’s not like there’s less work to do, but now we have a problem-solving vocabulary to bring to bear. OK, if we want to provide both options, what will the UX design be like? How to we need to structure the backend to implement that design? All this being said, digital documents are also unusual web applications. It is essential that documents are durable and can stand alone: if I deliver a document as a web app that becomes unusable if some third-party service goes down, I’ve done something wrong. They’re a perfect fit for the local-first</a> web design mentality. Likewise, when I describe a document as a “web app”, I don’t mean that it ought to have a million buttons, animations, and widgets everywhere. It’s reasonable to think of a document-as-app as having a “user interface”, but it’s one that should consist of almost all text, almost all of the time. The app-ness is about the sophistication of the tooling on the backend, not the apparent complexity of the frontend. The biggest challenge to this vision is undoubtedly archiving. I can save a PDF article as a file on disk with the confidence that my grandchildren will be able to read it if they want. Even if we ignore things like dependencies on third-party services, how can I possibly save a document-app with anything like the same level of confidence? At the most basic level, there seems to be no agreed-upon standard for archiving a website to disk — WARC</a> seems to be the leader, but to me it’s telling that none of the browsers have anything like a built-in “open WARC file” functionality. We can still work on creating state-of-the-art digital technical documents without solving this problem, but until that happens, the PDF isn’t going anywhere.

PKGW

DASCH Data are Now Queryable In VO Tools

A Modest Proposal for How to Do Digital Conference “Posters”

The Proposal🔗</a></h1> The digital equivalent of a traditional conference poster shall be called a “six-pack”.</li> It shall consist of a slide deck containing no more than six slides.</li>

A Tool and Workflow for Radio Astronomical “Peeling”

The xz Backdoor and Release Automation

DASCH Scanning is Complete — What’s Next?

Beta-Testing DASCH “DRnext”

Digital Documents have APIs

Sneak Peek: daschlab

Get a Standardized Browser Window

Using the Standardized Profile🔗</a></h1> Unpack the archived standard</code> tree</li> Run firefox --profile standard</code></li>

See Also🔗</a></h1> How to capture video efficiently on Linux</a></li> </ul>

Digital Documents are Web Applications

See Also 🔗</a></h1>

How to capture video efficiently on Linux</a></li> </ul>