Proceedings Volume 4847

Astronomical Data Analysis II

View the digital version of this volume at the SPIE Digital Library.

Volume Details

Date Published: 19 December 2002
Contents: 7 Sessions, 47 Papers, 0 Presentations
Conference: Astronomical Telescopes and Instrumentation 2002
Volume Number: 4847

Table of Contents


All links to SPIE Proceedings will open in the SPIE Digital Library.
  • Cosmology, CMB
  • Galaxy Distribution and Surveys
  • Poster Session
  • Resolution, PSF, and Deconvolution
  • Data Modeling
  • Poster Session
  • Data Modeling
  • Poster Session
  • Data Mining: Information Retrieval
  • Poster Session
  • Pipeline: Software and System
  • Poster Session
  • Cosmology, CMB
Cosmology, CMB
Observed correlations, evolution, and environment dependence of 9000 early-type galaxies in the SDSS
A sample of nearly 9000 early-type galaxies, observed in the g*, r*, i*, and z* bands, was selected from the Sloan Digital Sky Survey using morphological and spectral criteria. The sample spans the redshift range 0 < z < 0.3, and was used to study how early-type galaxy observables, including luminosity, effective radius, surface brightness, color, velocity dispersion, and chemical abundances are correlated with one another, how they evolve, and whether they depend on environment. Relative to the population at z~0.1, the median redshift of the sample, galaxies at lower and higher redshifts have evolved little. The luminosities and colors, the Fundamental Plane, and absorption-line strengths (obtained from co-added spectra of similar objects) suggest that the population is evolving passively, having formed the bulk of its stars about 9 Gyrs ago. While the Fundamental Plane suggests that galaxies in dense regions are slightly different from galaxies in less dense regions, the chemical abundances and color-velocity dispersion relations show no statistically significant environmental dependence. That we are able to measure velocity dispersions at all is a tribute to the quality of the SDSS spectrographs: they exceed the goals for which they were designed.
Planck data processing centers
Fabio Pasian, Jean-François Sygnet
The success of the Planck mission heavily relies on careful planning, design, and implementation of its ground segment facilities. Among these, two Data Processing Centres (DPCs) are being implemented, operated by the consortia responsible for building the instruments forming the scientific payload of Planck. The two DPCs, together with the Mission Operations Centre (MOC) and the Herschel Science Centre (HSC), are the major elements of the Herschel/Planck scientific ground segment. The Planck DPCs are responsible for the operation of the instruments, and for the production, delivery, and archiving of the scientific data products, which can be considered the final results of the mission:
  • Calibrated time series data, for each receiver, after removal of systematic features and attitude reconstruction.
  • Photometrically and astrometrically calibrated maps of the sky in the observed bands.
  • Sky maps of the main astrophysical components.
  • Catalogs of sources detected in the sky maps of the main astrophysical components.
  • CMB power spectrum coefficients.
During the development phase, the DPCs are furthermore responsible for producing data that realistically simulate the behaviour of the instruments in flight, and for supporting instrument testing activities. In this paper, some aspects related to the control of the Planck instruments, to the data flow, and to the data processing for Planck are described, and an overview of the activities being carried out is provided.
Spatial statistical analysis of large astronomical datasets
The future of astronomy will be dominated by large and complex databases. Megapixel CMB maps, joint analyses of surveys across several wavelengths, as envisioned in the planned National Virtual Observatory (NVO), and the TByte/day data rates of future surveys (Pan-STARRS) put stringent constraints on future data analysis methods: they have to achieve at least N log N scaling to be viable in the long term. This warrants special attention to computational requirements, which were ignored during the initial development of current analysis tools in favor of statistical optimality. Even an optimal measurement, however, has residual errors due to statistical sample variance. Hence a suboptimal technique with measurement errors significantly smaller than the unavoidable sample variance produces results which are nearly identical to those of a statistically optimal technique. For instance, for analyzing CMB maps, I present a suboptimal alternative, indistinguishable from the standard optimal method with N^3 scaling, that can be rendered N log N with a hierarchical representation of the data; a speed-up of a trillion times compared to other methods. In this spirit I will present a set of novel algorithms and methods for spatial statistical analyses of future large astronomical databases, such as galaxy catalogs, megapixel CMB maps, or any point source catalog.
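The scaling argument above can be made concrete with a tree-based pair count. Below is a minimal sketch (not the author's code) of a Peebles-Hauser two-point correlation estimate using a KD-tree, whose pair counting scales roughly as N log N; the catalogue sizes and radial bins are arbitrary illustrative choices.

```python
# Hedged sketch: KD-tree pair counts for the two-point correlation function,
# illustrating the kind of N log N scaling discussed in the abstract above.
import numpy as np
from scipy.spatial import cKDTree

def two_point_xi(data, randoms, r_bins):
    """Peebles-Hauser estimator xi = DD/RR - 1 from (N, 3) position arrays."""
    d_tree, r_tree = cKDTree(data), cKDTree(randoms)
    # cumulative pair counts within each radius, then differenced per bin
    dd = np.diff(d_tree.count_neighbors(d_tree, r_bins))
    rr = np.diff(r_tree.count_neighbors(r_tree, r_bins))
    norm = (len(randoms) / len(data)) ** 2
    return norm * dd / rr - 1.0

rng = np.random.default_rng(0)
data = rng.random((5000, 3))      # stand-in for a galaxy catalogue
randoms = rng.random((20000, 3))  # unclustered reference points
print(two_point_xi(data, randoms, np.linspace(0.01, 0.2, 11)))
```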
Detection of compact sources with multifilters
D. Herranz, Jose Luis Sanz, R. B. Barreiro, et al.
We present scale-adaptive filters that optimize the detection/separation of compact sources on a background. We assume that the sources have a multiquadric profile and a background modeled by a homogeneous and isotropic random field characterized by a power spectrum. We give an n-dimensional treatment but consider two interesting physical applications related to clusters of galaxies (the Sunyaev-Zel'dovich effect and X-ray emission). We extend this methodology to multifrequency maps, introducing multifilters that optimize the detection of clusters in microwave maps. We apply these multifilters to small patches of the sky (corresponding to 10 frequency channels), such as the ones that the future ESA Planck mission will produce. Our method predicts ≈10000 clusters in 2/3 of the sky, with the catalog being complete for fluxes S > 170 mJy at 300 GHz.
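As an illustration of the single-map building block behind such multifilters, the sketch below shows a matched filter in Fourier space; it is a hedged, minimal example, assuming the source profile and the background power spectrum are given on the same grid as the image, and it is not the authors' multifrequency implementation.

```python
# Hedged sketch: a single-map matched filter for compact source detection.
# The profile is centred and image-shaped; the power spectrum must be > 0.
import numpy as np

def matched_filter(image, profile, power_spectrum):
    """Filter `image` so that a source with the given profile has unit
    amplitude response while background modes are down-weighted by P(k)."""
    tau = np.fft.fft2(np.fft.ifftshift(profile))   # source profile in Fourier space
    psi = np.conj(tau) / power_spectrum            # matched filter
    peak = np.real(np.fft.ifft2(tau * psi)).max()  # response to the profile itself
    psi /= peak                                    # normalise to an unbiased amplitude
    return np.real(np.fft.ifft2(np.fft.fft2(image) * psi))
```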
Tools to detect non-Gaussianity in nonstandard cosmological models
Julio E. Gallegos, Enrique Martinez-Gonzalez, Francisco Argueso, et al.
We present a method to detect non-Gaussianity in CMB temperature fluctuation maps, based on the spherical Mexican Hat wavelet. We have applied this method to artificially generated non-Gaussian maps using the Edgeworth expansion. Analysing the skewness and kurtosis of the wavelet coefficients in contrast to Gaussian simulations, we find that the Mexican Hat is more efficient in detecting non-Gaussianity than the spherical Haar wavelet for all the different levels of non-Gaussianity introduced. These results are relevant to testing the Gaussian character of the CMB data. The method has also been applied to non-Gaussian maps generated by introducing an additional quadratic term in the gravitational potential.
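A minimal sketch of the statistical test described above follows: the skewness and kurtosis of the observed wavelet coefficients are compared against their distribution in Gaussian simulations. The spherical Mexican Hat decomposition itself is abstracted away; the function only assumes the coefficients at one scale are supplied.

```python
# Hedged sketch: significance of skewness/kurtosis relative to Gaussian simulations.
import numpy as np
from scipy.stats import skew, kurtosis

def non_gaussianity_significance(coeffs_obs, coeffs_sims):
    """coeffs_obs: 1D array of wavelet coefficients at one scale;
    coeffs_sims: list of equivalent arrays from Gaussian simulations."""
    s_obs, k_obs = skew(coeffs_obs), kurtosis(coeffs_obs)
    s_sim = np.array([skew(c) for c in coeffs_sims])
    k_sim = np.array([kurtosis(c) for c in coeffs_sims])
    # significance in units of the simulated scatter
    return ((s_obs - s_sim.mean()) / s_sim.std(),
            (k_obs - k_sim.mean()) / k_sim.std())
```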
Curvelet and wavelet performances to detect cosmological non-Gaussian signatures
Jean-Luc Starck, Nabila Aghanim, Olivier Forni
One of the goals in cosmology is to understand the formation and evolution of the structures resulting from the growth of initial density perturbations. Recent Cosmic Microwave Background (CMB) observations indicate that these perturbations essentially came out of Gaussian-distributed quantum fluctuations in the inflationary scenario. However, topological defects (e.g., cosmic strings) could contribute to the signal. One of their important footprints would be the predicted non-Gaussian distribution of the temperature anisotropies. In addition, other sources of non-Gaussian signatures do contribute to the signal, in particular the Sunyaev-Zel'dovich effect of galaxy clusters. In this general context, and motivated by the future CMB experiments, the question we address is how to search for, and discriminate between, different non-Gaussian signatures. To do so, we analyze simulated maps of the CMB temperature anisotropies using both wavelet and curvelet transforms. Curvelets take the form of basis elements which exhibit very high directional sensitivity and are highly anisotropic, which is not the case for wavelets. The sensitivity of both methods is evaluated using simulated data sets.
Galaxy Distribution and Surveys
Clustering statistics in cosmology
Vicent Martinez, Enn Saar
The main tools in cosmology for comparing theoretical models with the observations of the galaxy distribution are statistical. We will review the applications of spatial statistics to the description of the large-scale structure of the universe. Special topics discussed in this talk will be: description of the galaxy samples, selection effects and biases, correlation functions, Fourier analysis, nearest neighbor statistics, Minkowski functionals and structure statistics. Special attention will be devoted to scaling laws and the use of the lacunarity measures in the description of the cosmic texture.
Multiscale geometric analysis for 3D catalogs
David L. Donoho, Ofer Levi, Jean-Luc Starck, et al.
We have developed tools for analysis of 3D volumetric data which allow sensitive characterizations of filamentary structures in 3D point clouds. These tools rapidly compute multiscale X-ray transforms of the data volume. Subcubes of varying locations and scales are extracted from the data volume and each is analyzed by integrating along a strategically chosen set of line segments covering all different orientations. The underlying motivation is that point clouds with different degrees of filamentarity will lead to multiscale X-ray coefficients having different distributions when viewed at the right scale. The multiscale approach guarantees that information from all scales is available; by extracting the information from the transform in a statistically appropriate fashion, we can sensitively resolve differences in details of the filamentarity. We will describe the algorithm and the results of comparing different simulated galaxy distributions.
Wide-field cosmic shear surveys
Yannick Mellier, L. van Waerbeke, Etienne Bertin, et al.
We present the current status of cosmic shear based on all surveys done so far. Taken together, they cover about 70 deg² and concern more than 3 million galaxies with accurate shape measurements. Theoretical expectations, observational results, and their cosmological interpretations are discussed in the framework of standard cosmology and CDM scenarios. The potential of the next generation of cosmic shear surveys is discussed.
New algorithms and technologies for the unsupervised reduction of optical/IR images
This paper presents some of the main aspects of the software library that has been developed for the reduction of optical and infrared images, an integral part of the end-to-end survey system being built to support public imaging surveys at ESO. Some of the highlights of the new library are: unbiased estimates of the background, critical for deep IR observations; efficient and accurate astrometric solutions, using multi-resolution techniques; automatic identification and masking of satellite tracks; weighted co-addition of images; creation of optical/IR mosaics; and appropriate management of multi-chip instruments. These various elements have been integrated into a system using XML technology for setting input parameters, driving the various processes, producing comprehensive history logs, and storing the results, binding them to the supporting database and to the web. The system has been extensively tested using deep images as well as crowded fields (e.g., globular clusters, the LMC), processing at a rate of 0.5 Mega-pixels per second using a DS20E ALPHA computer with two processors. The goal of this presentation is to review some of the main features of this package.
Multiwavelength data mining of the ISOPHOT serendipity sky survey
Manfred Stickel, Dietrich Lemke, Ulrich Klaas, et al.
The ISOPHOT C200 stressed Ge:Ga array aboard the Infrared Space Observatory was used to carry out scientific observations while the telescope was moved from one target to the next. These strip scanning measurements of the sky in the far-infrared (FIR) at 170 μm comprise the ISOPHOT Serendipity Survey, the first slew survey designed as an integral part of a space observatory mission. The ISOPHOT Serendipity Survey is the only large-scale sky survey to date in the unexplored wavelength region beyond the IRAS 100 μm limit. Within nearly 550 hours, more than 12000 slew measurements with a total slew length of more than 150000 degrees were collected, corresponding to a sky coverage of about 15%. The slew data analysis has been focused on the detection of compact sources, which required the development of special algorithms. A severe problem at 170 μm is the confusion of genuine compact sources with foreground galactic cirrus knots and ridges. The selection and identification of objects therefore necessarily requires a multi-wavelength approach, which makes use of a broad variety of additional data from databases and other surveys. Known galaxies were identified by cross-correlating the Serendipity Survey source positions with galaxy entries in the NED and Simbad databases and a subsequent cross-check of optical images from the Digital Sky Survey. A large catalogue with 170 μm fluxes for ≈2000 galaxies is being compiled. The particularly interesting rare galaxies with very cold dust and very large dust masses further require additional FIR data from the IRAS survey as well as measured redshifts. A large fraction of the compact galactic structures are prestellar cores inside cold star-forming regions. Early stages of medium- and high-mass star-forming regions are identified by combining compact, bright, and cold Serendipity Survey sources with the near-infrared 2MASS and MSX surveys, the combination of which indicates large dust masses in conjunction with embedded young stars of early spectral types. In all the studied samples of different object classes, the 170 μm flux provides the crucial data point for a complete characterization of the FIR spectral energy distributions and the derivation of total dust masses. Follow-up observations are underway to study selected objects in more detail.
Poster Session
ISO long-wavelength spectrometer parallel mode
Tanya L. Lim, Florence Vivares, Bruce Miles Swinyard, et al.
The Infrared Space Observatory (ISO) had a scientific payload of four complementary instruments: a camera, a photometer, and two spectrometers. Two instruments, the Long Wavelength Spectrometer (LWS) and the camera, were able to operate in parallel mode, i.e., taking scientific data while another instrument was active. The LWS had ten detectors which allowed simultaneous coverage of the entire 43-197 micron range. In parallel mode the diffraction grating remained in a fixed position, allowing the spectrometer to act like a 10-channel photometer with bandwidths of 0.3 microns (one resolution element in second order) for the five short wavelength detectors and 0.6 microns (one resolution element in first order) for the five long wavelength detectors. This paper describes the LWS parallel mode and gives details on how the data were obtained. The paper also describes the automated processing developed for the parallel mode data and the calibration strategy employed. The parallel data have very good sparse coverage in the Rho Oph region, and a temperature map derived from the parallel mode data is presented.
Analysis of the galaxy distribution using multiscale methods
Philippe Querre, Jean-Luc Starck, Vicent J. Martinez
Galaxies are arranged in interconnected walls and filaments forming a cosmic web encompassing huge, nearly empty regions between the structures. Many statistical methods have been proposed in the past in order to describe the galaxy distribution and discriminate between the different cosmological models. We present in this paper preliminary results relative to the use of new statistical tools based on the 3D à trous algorithm, the 3D ridgelet transform, and the 3D beamlet transform. We show that such multiscale methods provide a new way to measure, in a coherent and statistically reliable way, the degree of clustering, filamentarity, sheetedness, and voidedness of a dataset.
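For concreteness, here is a minimal sketch of the first of these tools, the undecimated à trous wavelet transform in 3D with the standard B3-spline smoothing kernel; the ridgelet and beamlet transforms used by the authors are not reproduced here.

```python
# Hedged sketch: isotropic "a trous" wavelet transform of a 3D data cube.
import numpy as np
from scipy.ndimage import convolve1d

def atrous_3d(cube, n_scales):
    """Return [w_1, ..., w_J, c_J]: wavelet planes plus the smooth residual.
    Summing all returned arrays reconstructs the input cube."""
    h = np.array([1, 4, 6, 4, 1]) / 16.0    # B3-spline kernel
    c = cube.astype(float)
    planes = []
    for j in range(n_scales):
        # dilate the kernel by inserting 2**j - 1 zeros between taps ("holes")
        hj = np.zeros(4 * 2**j + 1)
        hj[::2**j] = h
        smooth = c
        for axis in range(3):               # separable smoothing along each axis
            smooth = convolve1d(smooth, hj, axis=axis, mode='reflect')
        planes.append(c - smooth)           # wavelet coefficients at scale j
        c = smooth
    planes.append(c)
    return planes
```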
Resolution, PSF, and Deconvolution
Deconvolution of astronomical data: What for and where do we go?
Modern astronomy tries to push observing instruments to their limits in order to discover new, unsuspected phenomena in the universe. However, some intrinsic limitations (diffraction, atmospheric seeing, etc.) strongly degrade the spatial resolution when imaging objects in the sky, and can hardly be reduced simply by technology improvements. In order to get the best scientific return from new, ever more sensitive instruments, image deconvolution methods try to push back these limits. Indeed, when the spatial degradation is known, partly known, or even unknown, deconvolution algorithms have proven able to restore an original image that is close, within the errors of the method, to a perfect image as observed by a perfect instrument with no, or heavily reduced, image degradation. Many methods and algorithms exist nowadays to solve numerically the problem of image deconvolution. I will review the most popular ones and show their characteristics and limitations. In particular, I will show how we moved from standard methods (which see the image as one unique object) to multiscale methods that analyse the data from a multiresolution point of view, decoupling an originally very complicated problem into a set of problems that are easier to solve.
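Among the standard methods reviewed, the Richardson-Lucy iteration is one of the most widely used. The following is a minimal, hedged FFT-based sketch (spatially invariant PSF, no multiscale regularisation), assuming the PSF array is centred and has the same shape as the image.

```python
# Hedged sketch: Richardson-Lucy deconvolution with a known, invariant PSF.
import numpy as np

def richardson_lucy(image, psf, n_iter=30, eps=1e-12):
    psf = psf / psf.sum()
    otf = np.fft.rfft2(np.fft.ifftshift(psf))   # PSF assumed centred, image-shaped
    conv = lambda x, h: np.fft.irfft2(np.fft.rfft2(x) * h, s=image.shape)
    estimate = np.full_like(image, image.mean(), dtype=float)
    for _ in range(n_iter):
        ratio = image / np.maximum(conv(estimate, otf), eps)
        estimate *= conv(ratio, np.conj(otf))    # correction by the flipped PSF
        np.maximum(estimate, 0, out=estimate)    # keep the estimate non-negative
    return estimate
```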
Super-Nyquist sampling strategies
Barry F. Madore, Wendy L. Freedman
For the discovery of period variable stars using the Hubble Space Telescope an efficient and novel search strategy was developed. We present here a brief overview of the algorithm and its advantages as compared to strict Nyquist sampling.
Deconvolution with a spatially variant PSF
Application of deconvolution algorithms to astronomical images is often limited by variations in PSF structure over the domain of the images. One major difficulty is that Fourier methods can no longer be used for fast convolutions over the entire images. However, if the PSF is modeled as a sum of orthogonal functions that are individually constant in form over the images, but whose relative amplitudes encode the PSF spatial variability, then separation of variables again allows global image operations to be used. This approach is readily adapted to the Lucy-Richardson deconvolution algorithm. Use of the Karhunen-Loeve transform allows for a particularly compact orthogonal expansion of the PSF. These techniques are demonstrated on the deconvolution of Gemini/Hokupa'a adaptive optics images of the galactic center.
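The separation-of-variables idea can be sketched as follows: if the PSF at position u is written as a sum over fixed components p_k weighted by amplitude maps a_k(u), then spatially variant blurring reduces to a few global FFT convolutions. This is a hedged illustration of the forward blur model only; plugging it and its adjoint into the Lucy-Richardson iteration is not reproduced here, and the component PSFs and amplitude maps (e.g. from a Karhunen-Loeve expansion) are assumed to be given.

```python
# Hedged sketch: spatially variant blur as a weighted sum of global convolutions,
# blurred = sum_k p_k * (a_k * image), with each p_k constant over the field.
import numpy as np

def variant_psf_blur(image, components, amplitudes):
    """components: list of PSF images p_k (image-shaped, centred);
    amplitudes: list of amplitude maps a_k(u), also image-shaped."""
    out = np.zeros(image.shape, dtype=float)
    for p_k, a_k in zip(components, amplitudes):
        otf = np.fft.rfft2(np.fft.ifftshift(p_k))
        out += np.fft.irfft2(np.fft.rfft2(a_k * image) * otf, s=image.shape)
    return out
```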
Optimization issues in blind deconvolution algorithms
Modern blind deconvolution algorithms combine agreement with the data and regularization constraints into a single criterion (a so-called penalizing function) that must be minimized in a restricted parameter space (at least to ensure positivity). Numerically speaking, blind deconvolution is a constrained optimization problem which must be solved by iterative algorithms owing to the very large number of parameters that must be estimated. Additional strong difficulties arise because blind deconvolution is intrinsically ambiguous and highly non-quadratic, which prevents the problem from being solved quickly. Various optimizations are proposed to considerably speed up blind deconvolution. These improvements allow the application of blind deconvolution to the very large images that are now routinely provided by telescope facilities. First, it is possible to explicitly cancel the normalization ambiguity and therefore improve the condition number of the problem. Second, positivity can be enforced by gradient projection techniques without the need for a non-linear re-parameterization. Finally, superior convergence rates can be obtained by using a small sub-space of ad hoc search directions derived from the effective behavior of the penalizing function.
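A hedged, minimal sketch of the gradient-projection idea mentioned above: positivity is enforced by projecting each gradient step onto the non-negative orthant rather than by a non-linear re-parameterization. The penalizing function and its gradient are left abstract here.

```python
# Hedged sketch: one projected-gradient update under the constraint x >= 0.
import numpy as np

def projected_gradient_step(x, grad_fn, step):
    x_new = x - step * grad_fn(x)      # plain gradient descent step
    np.maximum(x_new, 0.0, out=x_new)  # projection onto the feasible set {x >= 0}
    return x_new
```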
Data Modeling
Monte Carlo methods for x-ray dispersive spectrometers
John R. Peterson, J. Garrett Jernigan, Steven M. Kahn
We discuss multivariate Monte Carlo methods appropriate for X-ray dispersive spectrometers. Dispersive spectrometers have many advantages for high resolution spectroscopy in the X-ray band. Analysis of data from these instruments is complicated by the fact that the instrument response functions are multi-dimensional and relatively few X-ray photons are detected from astrophysical sources. Monte Carlo methods are the natural solution to these challenges, but techniques for their use are not well developed. We describe a number of methods to produce a highly efficient and flexible multivariate Monte Carlo. These techniques include multi-dimensional response interpolation and multi-dimensional event comparison. We discuss how these methods have been extensively used in the XMM-Newton Reflection Grating Spectrometer in-flight calibration program. We also show several examples of a Monte Carlo applied to observations of clusters of galaxies and elliptical galaxies with the XMM-Newton observatory.
The MATPHOT algorithm for digital point spread function CCD stellar photometry
The MATPHOT algorithm for digital Point Spread Function (PSF) CCD stellar photometry is described. A theoretical photometric and astrometric performance model is presented for PSF-fitting stellar photometry. MATPHOT uses a digital representation of the sampled PSF consisting of a numerical table (e.g., a matrix or a FITS image) instead of an analytical function. MATPHOT achieves accurate stellar photometry of under-sampled CCD observations by using super-sampled PSFs. MATPHOT currently locates a PSF within the observational model using a 21-pixel-wide damped sinc interpolation function. Position partial derivatives of the observational model are determined using numerical differentiation techniques. Results of MATPHOT-based design studies of the optical performance of the Next Generation Space Telescope are presented; observations of bright stars analyzed with the MATPHOT algorithm can yield millimag photometric errors with millipixel relative astrometric errors -- or better -- if observed with a perfect detector. Plans for the future development of a parallel-processing version of the MATPHOT algorithm using Beowulf clusters are described. All of the C source code and documentation for MATPHOT is freely available as part of the MXTOOLS package for IRAF (http://www.noao.edu/staff/mighell/mxtools). This work is supported by a grant from NASA's Office of Space Science.
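The flavour of the 21-pixel damped sinc interpolation can be sketched as below; this is a hedged 1D illustration only, and the damping window and its width are assumptions (a Gaussian taper is used here), not the exact MATPHOT kernel.

```python
# Hedged sketch: value of a sampled signal at a fractional position using a
# 21-tap damped-sinc interpolant (sinc times a Gaussian taper).
import numpy as np

def damped_sinc_value(signal, x, half_width=10, damp=3.25):
    """Interpolate `signal` at fractional index x; `damp` is an illustrative choice."""
    i0 = int(np.floor(x))
    frac = x - i0
    offsets = np.arange(-half_width, half_width + 1)
    idx = np.clip(i0 + offsets, 0, len(signal) - 1)   # edge pixels are replicated
    taps = np.sinc(offsets - frac) * np.exp(-((offsets - frac) / damp) ** 2)
    return np.dot(signal[idx], taps / taps.sum())     # normalised to preserve flux
```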
Pattern and Transient Removal Algorithms
This paper describes algorithms for the removal of two types of contaminating features in astronomical images. One is background patterns that are spatially fixed (over many exposures) but variable in amplitude. These include fringing and pupil patterns. An algorithm for automatically fitting the pattern amplitude is presented. The second type of contamination consists of non-astronomical (cosmic rays and satellite trails) and astronomical (asteroids) transient sources. An algorithm for detecting and removing these sources from stacked images is also described. Implementations of these algorithms in IRAF are illustrated with data from the NOAO large format mosaic imagers taken by the NOAO Deep-Wide Field Survey (NDWFS).
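The amplitude fit for a fixed background pattern reduces to a one-parameter linear least-squares problem, as in the hedged sketch below; the masking convention and the use of a background-subtracted image are assumptions for illustration, not the exact IRAF implementation.

```python
# Hedged sketch: fit the amplitude of a fixed template (fringe/pupil pattern)
# to an exposure and subtract it.
import numpy as np

def fit_pattern_amplitude(image, pattern, mask=None):
    """Best-fit amplitude a minimising sum((image - a*pattern)**2) over
    unmasked pixels (mask=True means 'exclude this pixel')."""
    good = np.ones(image.shape, bool) if mask is None else ~mask
    a = np.sum(image[good] * pattern[good]) / np.sum(pattern[good] ** 2)
    return a, image - a * pattern
```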
Poster Session
Proper handling of random errors and distortions in astronomical data analysis
Nicole Cardiel, Javier Gorgas, Jesus Gallego, et al.
The aim of a data reduction process is to minimize the influence of data acquisition imperfections on the estimation of the desired astronomical quantity. For this purpose, one must perform appropriate manipulations with data and calibration frames. In addition, random-error frames (computed from first principles: expected statistical distribution of photo-electrons, detector gain, readout-noise, etc.), corresponding to the raw-data frames, can also be properly reduced. This parallel treatment of data and errors guarantees the correct propagation of random errors due to the arithmetic manipulations throughout the reduction procedure. However, due to the unavoidable fact that the information collected by detectors is physically sampled, this approach collides with a major problem: errors are correlated when applying image manipulations involving non-integer pixel shifts of data. Since this is actually the case for many common reduction steps (wavelength calibration into a linear scale, image rectification when correcting for geometric distortions,...), we discuss the benefits of considering the data reduction as the full characterization of the raw-data frames, but avoiding, as far as possible, the arithmetic manipulation of that data until the final measure of the image properties with a scientific meaning for the astronomer. For this reason, it is essential that the software tools employed for the analysis of the data perform their work using that characterization. In that sense, the real reduction of the data should be performed during the analysis, and not before, in order to guarantee the proper treatment of errors.
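The parallel treatment of data and error frames amounts to mirroring every arithmetic step with first-order error propagation, as in this hedged sketch (independent errors assumed; the function names are illustrative, not the authors' software).

```python
# Hedged sketch: propagate an error frame alongside the data frame.
import numpy as np

def subtract(data, err, bias, bias_err):
    """Data minus bias; errors add in quadrature."""
    return data - bias, np.sqrt(err**2 + bias_err**2)

def divide(data, err, flat, flat_err):
    """Flat-field division with first-order error propagation."""
    out = data / flat
    out_err = np.sqrt((err / flat)**2 + (data * flat_err / flat**2)**2)
    return out, out_err
```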
Automatic processing method for astronomical CCD images
Binhua Li, Lei Yang, Wei Mao
Since several hundred CCD images are obtained with the CCD camera on the Lower Latitude Meridian Circle (LLMC) every observational night, it is essential to adopt an automatic processing method to find the initial position of each object in these images, to center the detected objects, and to calculate their magnitudes. In this paper several existing algorithms for automatically searching for objects in astronomical CCD images are reviewed. Our automatic searching algorithm is described, which includes five steps: background calculation, filtering, object detection and identification, and defect elimination. Then several existing two-dimensional centering algorithms are also reviewed, and our modified two-dimensional moment algorithm and an empirical formula for the centering threshold are presented. An algorithm for determining the magnitudes of objects is also presented in the paper. All these algorithms are programmed in the VC++ programming language. Finally, our method is tested with CCD images from the 1m RCC telescope at Yunnan Observatory, and some preliminary results are also given.
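A basic moment-based centring step of the kind described above can be sketched as follows; this is a hedged illustration, and the authors' modified moment algorithm and empirical threshold formula differ in detail.

```python
# Hedged sketch: intensity-weighted first moments above a threshold give the centre.
import numpy as np

def moment_centroid(stamp, background, threshold):
    """Return (x_c, y_c); assumes at least one pixel exceeds the threshold."""
    signal = np.asarray(stamp, float) - background
    signal[signal < threshold] = 0.0          # keep only significant pixels
    yy, xx = np.indices(signal.shape)
    total = signal.sum()
    return (xx * signal).sum() / total, (yy * signal).sum() / total
```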
Comparison of algorithms for the efficient approximation of heterogeneous multidimensional scientific data
Immanuel Freedman
Many scientists would like to be able to view and analyze quick look astronomical data on hand held devices linked by wireless network to the Internet. Scientific data is often characterized by high dynamic range together with abrupt, localized or extended changes of spatial and temporal statistical properties. I compare the effectiveness of algorithms for the efficient approximation of scientific data that support low bit-rate, near real-time and low-delay communication of heterogeneous multidimensional scientific data over existing or planned wireless networks.
Data Modeling
Toward accurate radial velocities with the fiber-fed GIRAFFE multi-object VLT spectrograph
Frederic Royer, Andre Blecha, Pierre North, et al.
We describe briefly the Data-Reduction of the VLT fiber-fed multi-object GIRAFFE spectrograph - part of the VLT FLAMES facility. We focus on specific features of GIRAFFE - the simultaneous wavelength calibration - and their impact on the data-reduction strategy. We describe the implementation of the global physical model and we compare the results obtained with the simulated, laboratory and preliminary data. We discuss the influence of critical parameters, the overall accuracy of the wavelength solution, and the stability and the robustness of the global model approach. We address the accuracy of radial velocity measurements illustrated by solar spectra obtained during the Preliminary Acceptance in Europe.
Poster Session
DASH--distributed analysis system hierarchy
Masafumi Yagi, Yoshihiko Mizumoto, Ryusuke Ogasawara, et al.
We have developed and are operating an object-oriented data reduction and data analysis system, DASH (Distributed Analysis Software Hierarchy), for efficient data processing for the SUBARU telescope. In DASH, all information for reducing a set of data is packed into an abstracted object, named a "Hierarchy". It contains rules for searching for calibration data, the reduction procedure to the final result, and also the reduction log. With Hierarchy, DASH works as an automated reduction pipeline platform in cooperation with STARS (Subaru Telescope ARchive System). DASH is implemented with CORBA and Java technology. The portability of these technologies enabled us to make a subset of the system for a small stand-alone system, SASH. SASH is compatible with DASH, and one can continuously reduce and analyze data between DASH and SASH.
Analysis of multispectral radiometric signatures from geosynchronous satellites
Tamara E.W. Payne, Stephen A. Gregory, Nina M. Houtkooper, et al.
The Air Force Research Laboratory Directed Energy Directorate has collected and analyzed passive multispectral radiometric data using two different sets of filters: astronomical broad-band Johnson filters and the Space Object Identification In Living Color (SILC) filters for Space Situational Awareness (SSA) of geosynchronous satellites (GEOs). The latter set of filters was designed as part of the SILC Space Battlelab initiative. The radiometric data of geosynchronous satellites were taken using a charge-coupled device (CCD) on the 24-inch Ritchey-Chretien telescope at Capilla Peak Observatory of the University of New Mexico. The target list comprises satellites with similar and dissimilar bus structures. Additionally, some of the satellites are in a cluster. The results presented will show the advances in classifying GEOs by their bus type and a resolution scenario of cluster cross-tagging using multispectral radiometric measurements.
Physical and Statistical Modeling of Saturn's Troposphere
Padmavati A. Yanamandra-Fisher, Amy J. Braverman, Glenn S. Orton
The 5.2-μm atmospheric window on Saturn is dominated by thermal radiation and weak gaseous absorption, with a 20% contribution from sunlight reflected from clouds. The striking variability displayed by Saturn's clouds at 5.2 μm and the detection of PH3 (an atmospheric tracer) variability near or below the 2-bar level and possibly at lower pressures provide salient constraints on the dynamical organization of Saturn's atmosphere by constraining the strength of vertical motions at two levels across the disk. We analyse the 5.2-μm spectra of Saturn by utilising two independent methods: (a) physical models based on the relevant atmospheric parameters and (b) statistical analysis, based on principal components analysis (PCA), to determine the influence of the variation of phosphine and the opacity of clouds deep within Saturn's atmosphere to understand the dynamics in its atmosphere.
Real-time detection of optical transients with RAPTOR
Konstantin N. Borozdin, Steven P. Brumby, Mark C. Galassi, et al.
Fast variability of optical objects is an interesting though poorly explored subject in modern astronomy. Real-time data processing and identification of transient celestial events in the images is very important for such studies, as it allows rapid follow-up with more sensitive instruments. We discuss an approach which we have developed for the RAPTOR project, a pioneering closed-loop system combining real-time transient detection with rapid follow-up. RAPTOR's data processing pipeline is able to identify and localize an optical transient within seconds after the observation. The testing we have performed so far confirms the effectiveness of our method for optical transient detection. The software pipeline we have developed for RAPTOR can easily be applied to the data from other experiments.
Multiresolution filtering and segmentation of multispectral images
Fionn D. Murtagh, Christophe Collet, Mireille Louys, et al.
We consider multiple resolution methods for filtering and segmenting multispectral astronomical images. For filtering, we use noise modeling, wavelet transform, and the Karhunen-Loeve transform. For segmentation, we use a quadtree followed by the fitting of a Markov model. We illustrate these methods on Hubble Space Telescope near infrared NICMOS camera images.
Data Mining: Information Retrieval
Statistical methodology for massive datasets and model selection
G. Jogesh Babu, James P. McDermott
Astronomy is facing a revolution in data collection, storage, analysis, and interpretation of large datasets. The data volumes here are several orders of magnitude larger than what astronomers and statisticians are used to dealing with, and the old methods simply do not work. The National Virtual Observatory (NVO) initiative has recently emerged in recognition of this need and to federate numerous large digital sky archives, both ground based and space based, and develop tools to explore and understand these vast volumes of data. In this paper, we address some of the critically important statistical challenges raised by the NVO. In particular a low-storage, single-pass, sequential method for simultaneous estimation of multiple quantiles for massive datasets will be presented. Density estimation based on this procedure and a multivariate extension will also be discussed. The NVO also requires statistical tools to analyze moderate size databases. Model selection is an important issue for many astrophysical databases. We present a simple likelihood based 'leave one out' method to select the best among the several possible alternatives. The performance of the method is compared to those based on Akaike Information Criterion and Bayesian Information Criterion.
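The model-selection step mentioned above can be illustrated with the two classical criteria it is compared against; this hedged sketch only shows an AIC/BIC comparison from maximised log-likelihoods (the example numbers are hypothetical), and the authors' likelihood-based 'leave one out' method is not reproduced.

```python
# Hedged sketch: compare candidate models with AIC and BIC.
import numpy as np

def aic(loglike, n_params):
    return 2 * n_params - 2 * loglike

def bic(loglike, n_params, n_data):
    return n_params * np.log(n_data) - 2 * loglike

# hypothetical maximised log-likelihoods and parameter counts
models = {"powerlaw": (-1523.4, 2), "broken_powerlaw": (-1518.9, 4)}
n_data = 1000
best = min(models, key=lambda m: bic(models[m][0], models[m][1], n_data))
print("preferred model (BIC):", best)
```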
Second-order bibliometric operators in the Astrophysics Data System
Michael J. Kurtz, Guenther Eichhorn, Alberto Accomazzi, et al.
Second order relational operators are functions which take lists which have been generated by a database query, and from those lists form sets of other lists, which can then be merged and sorted on the basis of one or more of the attributes of the items in the lists. The NASA Astrophysics Data System is unique among bibliometric information retrieval systems in the degree to which users are permitted to make use of these concepts. Given a knowledge of how the second order operators work, ADS users can create complex logical algebras which facilitate the discovery of very highly specific information.
Knowledge discovery in bibliographic collections using concept hierarchies and visualization tools: application to the astronomy domain
Josiane Mothe, Daniel Egret, Claude Chrisment, et al.
This paper presents new methods for knowledge extraction and visualization, applied to datasets selected from the astronomical literature. One of the objectives is to detect correlations between concepts extracted from the documents. Concepts are generally meta-information which may be defined a priori, or may be extracted from the document contents and are organised along domain ontologies or concept hierarchies. The study illustrated in the paper uses a data collection of about 10,000 articles extracted from the NASA ADS, corresponding to all publications for which at least one author is a French astronomer, for the years 1996 to 2000. The study presents new approaches for visualizing relationships between institutes, co-authorships, scientific domains, astronomical object types, etc.
Poster Session
New automated classification technique of galaxy spectra with Z<1.2 based on PCA-ODP
In this paper, we investigate the Principal Component Analysis-Optimal Discrimination Plane (PCA-ODP) approach on a data set of galaxy spectra including eleven standard subtypes with redshift values ranging from 0 to 1.2 in steps of 0.001. These eleven subtypes are E, S0, Sa, Sb, Sc, SB1, SB2, SB3, SB4, SB5, and SB6, according to the Hubble sequence. Among them, the first four subtypes belong to the class of normal galaxies (NGs); the remaining seven belong to active galaxies (AGs). We apply the PCA approach to extract the features of galaxy spectra, project the samples onto the PCs, and investigate the ODP method on the data in feature space to find the optimal discrimination plane of the two main classes. The ODP approach was developed from Fisher's linear discriminant method; the difference between them is that Fisher's method uses only one vector (Fisher's vector), whereas ODP uses two orthogonal vectors, Fisher's vector and a second one orthogonal to it. Besides the data set above, we also use Sloan Digital Sky Survey (SDSS) galaxy spectra and the Kennicutt (1992) galaxy data to test the ODP classifier. The experimental results show that our proposed technique is both robust and efficient. The correct classification rate can reach as high as 99.95% for the first group of data, 96% for the SDSS data, and 98% for the Kennicutt data.
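The PCA feature extraction and the Fisher discriminant direction underlying the PCA-ODP method can be sketched as below; this is a hedged illustration only, and the second ODP vector orthogonal to Fisher's vector is not constructed here.

```python
# Hedged sketch: PCA projection of spectra followed by Fisher's discriminant.
import numpy as np

def pca_project(spectra, n_components):
    """Project (n_spectra, n_pixels) data onto the leading principal components."""
    mean = spectra.mean(axis=0)
    _, _, vt = np.linalg.svd(spectra - mean, full_matrices=False)
    return (spectra - mean) @ vt[:n_components].T, vt[:n_components]

def fisher_direction(features, labels):
    """Fisher's linear discriminant direction for a two-class problem (labels 0/1)."""
    x0, x1 = features[labels == 0], features[labels == 1]
    sw = (np.cov(x0, rowvar=False) * (len(x0) - 1) +
          np.cov(x1, rowvar=False) * (len(x1) - 1))      # within-class scatter
    w = np.linalg.solve(sw, x1.mean(axis=0) - x0.mean(axis=0))
    return w / np.linalg.norm(w)
```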
Classification of AGNs from stars and normal galaxies by support vector machines
In order to explore the spectral energy distribution of various objects in a multidimensional parameter space, multiwavelength data of quasars, BL Lacs, active galaxies, stars, and normal galaxies are obtained by positional cross-identification of the optical (USNO-A2), X-ray (ROSAT), and infrared (2MASS) bands. Different classes of X-ray emitters populate distinct regions of a multidimensional parameter space. In this paper, an automatic classification technique called Support Vector Machines (SVMs) is put forward to classify them using 7 parameters and 10 parameters. The results show that SVMs are an effective method to separate AGNs from stars and normal galaxies with data from the optical and X-ray bands, and with data from the optical, X-ray, and infrared bands. Furthermore, we conclude that the classification of objects is influenced not only by the method but also by the chosen wavelengths. Moreover, it is evident that the more wavelengths we choose, the higher the accuracy is.
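A hedged sketch of an SVM applied to such a multiwavelength parameter table follows; the feature file and labels are hypothetical, and scikit-learn (which did not exist in 2002) merely stands in for the authors' SVM implementation.

```python
# Hedged sketch: RBF-kernel SVM separating AGNs from stars/normal galaxies.
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

X = np.load("xray_optical_ir_features.npy")   # hypothetical (n_objects, 10) parameter table
y = np.load("labels.npy")                     # hypothetical labels: 0 = star/normal galaxy, 1 = AGN

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)
scaler = StandardScaler().fit(X_train)
clf = SVC(kernel="rbf", C=10.0).fit(scaler.transform(X_train), y_train)
print("test accuracy:", clf.score(scaler.transform(X_test), y_test))
```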
Steps toward the development of an automatic classifier for astronomical sources
Carole Thiebaut, Michel Boer, Sylvie Roques
We present the progress we have made in implementing a new kind of automatic classifier for astronomical objects. The classifier will work both in the image and time domains and take into account the geometrical and temporal characteristics of the sources. We have first constructed a 2D classifier which is based on a Self-Organizing Map. The developed network is able to learn through experience and to discriminate between astronomical objects such as stars, galaxies, saturated objects, or blended objects. In order to recognize and classify variable objects, the method had to be improved. We present the next step of classification through our 3D (geometry - time) classifier. The temporal characteristics of the sources are obtained by different analyses of their light curves: time-domain, frequency, and time-frequency analysis. We add the geometrical and temporal characteristics to obtain a complete classification of the sources. We plan to use difference image analysis to obtain blocks of images and analyze them directly through the classifier. Such a complete classification has not yet been realized in the astronomical domain. In general our method works better than other automatic methods and allows a more complete discrimination between astronomical sources.
Bayesian model selection for spatial clustering in 3D surveys
Fionn D. Murtagh, C. Donalek, Giuseppe Longo, et al.
In this preliminary work on galaxy clustering, we study clusters of arbitrary shape using a 3D galaxy catalog (celestial coordinates and photometric redshifts) derived from Sloan Digital Sky Survey Early Release Data. Spatial influence is modeled using a Markov random field. Comparative model assessment is carried out using an approximation to Bayes factors, viz. the posterior odds of a hypothesis of a given number of clusters versus another. We conclude with a discussion of promising future research directions.
Pipeline: Software and System
Analysis and visualization of multiwavelength spectral energy distributions in the NASA/IPAC extragalactic database (NED)
Joseph M. Mazzarella, Barry F. Madore, Judy Bennett, et al.
The NASA/IPAC Extragalactic Database (NED, http://ned.ipac.caltech.edu/) currently contains over 4.5 million photometric measurements covering the electromagnetic spectrum from gamma rays through radio wavelengths for objects that are being cross-correlated among major sky surveys (e.g., SDSS, 2MASS, IRAS, NVSS, FIRST) and thousands of smaller, but unique and important, catalogs and journal articles. The ability to retrieve photometric data (including uncertainties, aperture information, and references) and display spectral energy distributions (SEDs) for individual objects has been available in NED for six years. In this paper we summarize recent enhancements that enable construction of large panchromatic data sets to facilitate multi-dimensional photometric analysis. The database can now be queried for samples of objects that meet flux constraints at any wavelength (e.g., objects with any available 20cm flux, or objects with fν[10μm] > 5.0 Jy). The ability to utilize criteria involving flux ratios (e.g., objects with fν[20cm]/fν[60μm] > 0.5) is under development. Such queries can be jointly combined with additional constraints on sky area, redshifts, object types, or sample membership, and the data are output with the consistent physical units required for comparative analysis. Some results derived from fused photometric data in NED are presented to highlight the large number and diversity of available SEDs.
Data mining of large astronomical databases with neural tools
Giuseppe Longo, Ciro Donalek, Giancarlo Raiconi, et al.
The International Virtual Observatory will pose unprecedented problems to data mining. We briefly discuss the effectiveness of neural networks as aids to the decisional process of the astronomer, and present the AstroMining package. This package was written in Matlab and C++ and provides a user-friendly interactive platform for various data mining tasks. Two applications are also briefly outlined: the derivation of photometric redshifts for a subsample of objects extracted from the Sloan Digital Sky Survey Early Data Release, and the evaluation of systematic patterns in the telemetry data for the Telescopio Nazionale Galileo (TNG).
Poster Session
Data reduction pipeline for EMIR: a near-IR multi-object spectrograph for the Spanish 10-m telescope
Jesus Gallego, Nicolas Cardiel, Angel Serrano, et al.
EMIR (Espectrografo Multiobjeto Infrarrojo) is a near-infrared wide-field camera and multi-object spectrograph to be built for the 10.4m Spanish telescope (Gran Telescopio Canarias, GTC) at La Palma. The Data Reduction Pipeline (DRP), which is being designed and built by the EMIR Universidad Complutense de Madrid group, will be optimized for handling and reducing near-infrared data acquired with EMIR. Both reduced data and associated error frames will be delivered to the end-users as a final product.
Specview: a Java tool for spectral visualization and model fitting of multi-instrument data
Specview is a spectral visualization tool designed to provide easy simultaneous display and analysis of multiple 1-D spectrograms of the same astronomical source taken with different instruments. It is a standalone Java application that features, aside from its main plotting capabilities, a spectral model fitting engine. This article describes its main features and some aspects of its internal design. The software can be downloaded from http://specview.stsci.edu.
AIPS++ and the GBT: a layered approach to processing and analysis of single-dish data
James Braatz, Joseph McMullin, Robert Garwood, et al.
The Green Bank Telescope (GBT) is a new 100-m diameter antenna with an unblocked aperture and an active surface. It is designed to observe at frequencies from 300 MHz to 100 GHz, and includes state of the art continuum and spectral backends. The GBT is also capable of pulsar work and recording as a VLB station, and array receivers are being developed as well. AIPS++ is the integral software package for analysis of GBT data both for scientific analysis as well as for control and engineering analysis of the component systems. In this paper we will give an overview of how the AIPS++ system is used with the GBT, with special consideration to the development of spectral analysis software. AIPS++ allows a layered approach to software development, and the spectral analysis capability gives a strong example of the usefulness of the layered approach. At the heart of AIPS++ is a suite of tools which are capable of astronomy-specific calculations as well as general purpose mathematical analysis, data visualization, GUI development, and scripting. A tool for analyzing single-dish data, DISH, is developed on this platform. DISH includes a number of modern features such as bulk processing of datasets and versatile GUI interaction. A simplified package using a familiar CLI, known as UNI-jr, is built on DISH and is available as an easy to learn path for processing scan-based data. Finally, the Interim Automated Reduction and Display System (IARDS) is built on UNI-jr and provides an automated reduction package and pseudo real-time display.
AQUA: a very fast automatic reduction pipeline for near-real-time GRB early afterglow detection
AQUA (Automated QUick Analysis) is the fast reduction pipeline for the near-infrared (NIR) images obtained by the REM telescope. REM (Rapid Eye Mount) is a robotic NIR/optical 60cm telescope for fast detection of the early afterglows of Gamma Ray Bursts (GRBs). NIR observations of GRB early afterglows are of crucial importance for GRB science, revealing even optically obscured or high-redshift events. On the technical side, they pose a series of problems: a luminous background, bright sources (as the counterparts should be observed a few seconds after the satellite trigger), and the need for fast detection, which forces a high image acquisition rate. Even though the observational strategy will change during the observation of a given event depending on the counterpart characteristics, we will start with 1-second exposures at the fastest possible rate. The main guideline in the AQUA pipeline development is to sustain such a data rate throughout the night with nearly real-time delivery of results. AQUA will start from the raw images and will deliver an alert with coordinates, photometry, and colors to larger telescopes to allow prompt spectroscopic and polarimetric observations. Very fast processing of the raw 512×512 32-bit images and variable source detection using both source catalogs and image comparison have been implemented to obtain a processing speed of at least 1 image/sec. AQUA is based on ANSI-C code optimized to run on a dual Athlon Linux PC with careful use of MMX and SSE instructions.
ISO long-wavelength spectrometer interactive analysis system
Tanya L. Lim, Gerard Hutchinson, Sunil D. Sidher, et al.
The Long Wavelength Spectrometer was one of two complementary spectrometers on the Infrared Space Observatory (ISO). The LWS operated between 44 and 197 microns, either in medium resolution mode, using a diffraction grating, or in high resolution mode, where a Fabry-Perot was also placed in the beam. All LWS data are processed through a standard 'pipeline', and this standard processing is adequate for the majority of the data. However, as the understanding of instrument behaviour increased during operations, it became apparent that various modes, in particular the Fabry-Perot scanning mode, could be better processed in an interactive manner. A complementary LWS Interactive Analysis (LIA) system was then developed for the scientific part of the data processing. The LIA system allows users to access the processing steps, in terms of visualisation of intermediate products and interactive manipulation of the data at each stage. This paper describes the LIA system and details, in terms of instrument behaviour, those data sets that require interactive processing. Following the release of the ISO Legacy Archive in 2001, LIA 10 was released in November 2001. A further update, LIA 10.1, will be released later this year, and we outline its new features along with some of the most commonly used LIA routines.
New image compression capabilities in CFITSIO
The previously proposed tiled-image compression scheme is now fully supported when reading and writing FITS images using the CFITSIO subroutine library. This scheme generally produces image compression factors that are superior to the standard gzip or unix compress algorithms, especially when compressing floating point data type images. In addition to reducing the required amount of disk space to store the image, this compression technique often makes applications programs run faster because of the reduced amount of magnetic disk I/O that is required to read or write the image.
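For readers working from Python rather than the C library, the same tiled-image compression convention can be exercised as in this hedged sketch; astropy's CompImageHDU stands in here for the CFITSIO API (the C calls themselves differ), and the data array is an arbitrary example.

```python
# Hedged sketch: write a tile-compressed FITS image using the tiled-image
# compression convention that CFITSIO supports natively.
import numpy as np
from astropy.io import fits

data = np.random.normal(size=(2048, 2048)).astype(np.float32)  # illustrative image
hdu = fits.CompImageHDU(data, compression_type='RICE_1', quantize_level=16.0)
hdu.writeto('compressed.fits', overwrite=True)
```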
Data reduction pipeline for OSIRIS, the new NIR diffraction limited imaging field spectrometer for the Keck adaptive optics system
Alfred Krabbe, Thomas M. Gasaway, Jason Weiss, et al.
OSIRIS is a near-infrared diffraction-limited imaging field spectrometer under development for the Keck Observatory adaptive optics system. Based upon lenslet pupil imaging, a diffraction grating, and a 2K×2K Hawaii2 HgCdTe array, OSIRIS is a highly efficient instrument at the forefront of today's technology. OSIRIS will deliver up to 4096 diffraction-limited spectra per readout in a complex interleaved format, posing new challenges regarding user interaction and data reduction. A data reduction software package is under development, aiming to provide the observer with a facility instrument that allows them to concentrate on science rather than dealing with instrumental as well as telescope- and atmosphere-related effects. Together with OSIRIS, a pipeline for basic data reduction will be provided with a new Keck instrument for the first time. Some aspects of the data reduction pipeline are presented here. The OSIRIS instrument itself, the astronomical background, and other software tools were presented elsewhere at this conference.
Sloan digital sky survey 1D spectroscopic pipeline
Mark SubbaRao, Josh Frieman, Mariangela Bernardi, et al.
The Sloan Digital Sky Survey (SDSS) is a redshift factory, producing more than 5000 spectra a night when conditions allow. In order not to fall behind the data acquisition, highly automated and accurate spectral reduction pipelines are needed to process the data. Also, the varied nature of SDSS target selection (the main galaxy sample, the luminous red galaxy sample, the QSO sample, stellar targets, and serendipitous objects) requires the pipeline to be adept at handling the full range of astronomical spectra, from a redshift of 0 out to a redshift of 6. The SDSS spectra are of exceptionally high quality, allowing the pipeline not only to determine redshifts and broadly classify the spectra, but also to measure line parameters, calculate spectral indices, velocity dispersions, etc. The purpose of this summary is to give an overview of the workings of the pipeline and its outputs, and to give some guidance in using those outputs.
Cosmology, CMB
Spatial clustering of galaxies in large datasets
Alexander Szalay, Tamas Budavari, Andrew Connolly, et al.
Datasets with tens of millions of galaxies present new challenges for the analysis of spatial clustering. We have built a framework that integrates a database of object catalogs, tools for creating masks of bad regions, and a fast (N log N) correlation code. This system has enabled unprecedented efficiency in carrying out the analysis of galaxy clustering in the SDSS catalog. A similar approach is used to compute the three-dimensional spatial clustering of galaxies on very large scales. We describe our strategy for estimating the effect of photometric errors using a database. We discuss our efforts as an early example of data-intensive science. While it would have been possible to get these results without the framework we describe, it will be infeasible to perform these computations on future huge datasets without using such a framework.