The XMM-Newton Serendipitous Source Catalogue: 1XMM

User Guide to the Catalogue


Release 1.3 28 January 2004 Associated with Catalogue version 1.0.1

Prepared by the XMM-Newton Survey Science Centre Consortium (http://xmmssc-www.star.le.ac.uk/)

This User Guide refers directly to the full FITS and plain-text formats of the catalogue. Most of the content is also applicable to, and useful for understanding, the database-served formats (from XSA, XCAT-DB, LEDAS), but there may be some aspects that are not applicable to the latter. Conversely, there are aspects of the database implementations that are not covered by this User Guide, but by the Help information provided from the database user interfaces.

The User Guide provides a detailed account of the production and contents of the catalogue. Users interested in the main properties of the catalogue will find the summary and sections 1 & 7 of most immediate interest, but we encourage all users also to look at section 6 which highlights some important caveats.


Summary

1XMM is the first comprehensive catalogue of serendipitous X-ray sources from the European Space Agency's (ESA) XMM-Newton observatory, and has been constructed by the XMM-Newton Survey Science Centre (SSC) on behalf of ESA. Most (> 80%) of the entries have not previously been reported as X-ray sources.

The catalogue contains source detections drawn from 585 XMM-Newton EPIC observations made between 2000 March 1 and 2002 May 5; all datasets were publicly available by 2003 January 31, but not all public observations are included in this catalogue. Net exposure times in these observations range from < 1000 up to ~100000 seconds. The total area of the catalogue fields is ~90 deg^2, but taking account of the substantial overlaps between observations, the net sky area covered independently is ~50 deg^2. The observations sample, albeit sparsely, most of the sky, with the exception of a 'hole' centered in the Cygnus region, caused by spacecraft observing constraints.

The processing to generate the catalogue is based closely on the data processing system used by the SSC in routine processing of XMM-Newton data on behalf of ESA to produce data products for observers and the archive. Considerable effort has been expended to 'calibrate' the catalogue in terms of understanding such issues as errors, biases, sensitivity and sky coverage.

The catalogue source detection and parametrization technique is optimized for point-like sources, and has been performed across several photon-energy bands and using data from each of the three EPIC cameras - PN, MOS-1, MOS-2. As a result of this optimization, sources that are significantly extended will have underestimated fluxes.

The catalogue has  ~ 400 columns; these include source-detection parameters (likelihood, position coordinates, counts, count rate, flux, hardness ratio, background estimates, errors etc), the results of cross-correlation with a large number of astronomical catalogues (SIMBAD, NED, USNO, GSC, APM, ROSAT etc), quality 'flags' resulting from visual screening, and 'meta-data' relating to the observation.

The expected number of spurious detections as a function of likelihood has been estimated by simulation to be  ~ 10, 5, 2, 1 per EPIC exposure (i.e., for a single camera) at likelihood thresholds of 6, 8, 10, 12.

As a quality control, each field has been visually screened. Where problems (i.e. deficiencies in the automatic processing) were identified, the detections affected were 'flagged' accordingly. The summary detection flag has values from 4 (best) to 0 (worst).

The catalogue contains 33026 X-ray source detections with likelihood values > 8 and summary quality flag > 0 together with a further 23685 detections with lower likelihood values and/or summary quality flag = 0. These latter sources have lower reliability, since those with quality flag = 0 are deemed to be false detections due to processing deficiencies, whilst at likelihoods below 8 the fraction of spurious sources expected on a statistical basis increases rapidly. The 33026 X-ray source detections relate to 28279 unique X-ray sources.

The median flux (in the total photon-energy band 0.2 - 12 keV) of the catalogue sources is ~3 x 10^-14 erg/cm^2/s; ~12% have fluxes below 1 x 10^-14 erg/cm^2/s.

The positional accuracy of the catalogue sources is generally  < 2 arcsec (68% confidence radius) for detections with likelihood  > 8. The flux estimates from the three EPIC cameras are overall in agreement to  ~ 2% for on-axis sources, and  ~ 6% off-axis.

In association with the catalogue itself, various data products are also made available (images, exposure maps, sensitivity maps, extracts from astronomical catalogues and databases etc).

The catalogue is available in several forms and from several servers:

XSA, XCAT-DB and LEDAS provide a Web-based user interface allowing filtering and searching of the catalogue, and links to associated data products. The SSC Home Page, XSA and VizieR allow download of the catalogue file in (binary) FITS. The SSC Home Page also provides a plain ASCII-text version.

Known problems and open issues are discussed in Sec. 6.

1. Introduction

Pointed observations with the XMM-Newton Observatory detect significant numbers of previously unknown 'serendipitous' X-ray sources in addition to the original target. Combining the data from many observations thus yields a serendipitous point source catalogue which, by virtue of the large field of view of XMM-Newton and its high sensitivity, represents a significant resource. The serendipitous point source catalogue enhances our knowledge of the X-ray sky and has the potential for advancing our understanding of the nature of various Galactic and extragalactic source populations.

The first XMM-Newton catalogue contains X-ray source detections from  ~ 600 XMM-Newton observations made between launch in 2000 March 1 and 2002 May 5 which are in the public domain. The production of this catalogue has been undertaken by the XMM-Newton Survey Science Centre (SSC) consortium in fulfillment of one of its major responsibilities within the XMM-Newton project. The catalogue production process has been designed to exploit fully the capabilities of the XMM-Newton EPIC cameras and to ensure the integrity and quality of the resultant catalogue through rigorous screening of the data.

2. Data selection

2.1 Selection of XMM-Newton observations

The original selection of XMM-Newton observations for processing in the catalogue pipeline was based on the desire to include a sufficient number of early observations to produce a statistically useful catalogue population and original expectations of the public release date of the data and the likely catalogue release date. From the observations initially selected, stage 1 screening (Sec. 3.3) resulted in the rejection of ~30% of the originally ingested observations, whilst issues raised in stage 2 screening, together with filtering against a public release date of 2003 January 31, reduced the total number of observations by a further ~10%. As a result of these different selection considerations, the observations finally included in the catalogue do not correspond to, e.g., a uniform set of XMM-Newton revolutions or a specified time interval.

Table 2.1 gives the list of the final 585 observations which are included in the catalogue.

2.2 Selection of exposures

Most, but not all, XMM-Newton observations involve a single exposure with each of the three EPIC cameras. (A significant number of observations involve multiple exposures and/or do not include exposures with one or more of the three cameras.)  For each observation we selected exposures for each of the three EPIC cameras for processing using the following criteria:

(i) At least one of the EPIC camera exposures must be longer than 1000 seconds.
(ii) The exposure must have been taken through a scientifically useful filter. In practice this rejected all exposures for which the filter position was open, closed, calibration or undefined. The possible filters used in the observations selected for the catalogue are Medium, Thick, Thin1, and Thin2 (PN only). For a detailed description of the filters see Sec. 3.3.6, 'EPIC filters and effective area', of the XMM-Newton Users' Handbook.
(iii) The exposure must have been taken in a mode which could usefully be processed by the detection stage. PN small window modes were rejected since the effective field of view in these modes is small, which makes the background fitting stage of the source detection problematic. For the MOS nearly all modes were included, including those in which the area of the central CCD was windowed or missing (e.g., timing modes, here 'Fast Uncompressed'), but excluding refreshed frame store mode. The possible observing modes used in the observations selected for the catalogue are given in Tab. 2.2. For a detailed description of the modes see Sec. 3.3.2, 'Science modes of the EPIC cameras', of the XMM-Newton Users' Handbook.
(iv) Background filtering (see Sec. 3.2.2) must have been successfully applied. This was not the case when the sum of all Good Time Intervals (hereafter GTIs) was less than 1000 seconds. Without background filtering the source detection is typically of limited value due to the much higher net background.

Where more than one exposure within an observation for a particular camera met the above selection conditions, only the exposure with maximum duration was chosen.
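
The selection rules above can be illustrated with a short sketch. It is not the pipeline code: the Exposure record, its field names and the mode labels used for the rejected modes are hypothetical stand-ins chosen for this example, and the restriction of Thin2 to the PN is not checked.

```python
from dataclasses import dataclass

MIN_SECONDS = 1000.0
USEFUL_FILTERS = {"Medium", "Thick", "Thin1", "Thin2"}
# Hypothetical labels for the modes rejected under criterion (iii).
REJECTED_MODES = {
    ("PN", "SmallWindow"),
    ("M1", "RefreshedFrameStore"),
    ("M2", "RefreshedFrameStore"),
}


@dataclass
class Exposure:
    camera: str        # "PN", "M1" or "M2"
    duration: float    # exposure duration in seconds
    filter_name: str   # e.g. "Medium", "Open", "Closed"
    mode: str          # observing mode label (hypothetical naming)
    gti_sum: float     # summed good time after background filtering, seconds


def usable(exposure):
    """Criteria (ii)-(iv): useful filter, processable mode, enough good time."""
    return (
        exposure.filter_name in USEFUL_FILTERS
        and (exposure.camera, exposure.mode) not in REJECTED_MODES
        and exposure.gti_sum >= MIN_SECONDS
    )


def select_exposures(exposures):
    """Return at most one exposure per camera for a single observation."""
    # Criterion (i): at least one exposure must exceed 1000 seconds.
    if not any(e.duration > MIN_SECONDS for e in exposures):
        return {}
    best = {}
    for e in exposures:
        if not usable(e):
            continue
        # Where several exposures qualify, keep the one of maximum duration.
        if e.camera not in best or e.duration > best[e.camera].duration:
            best[e.camera] = e
    return best
```

For an observation with two usable PN exposures of 5000 s and 8000 s, the sketch keeps only the 8000 s exposure for the PN.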

3. Data processing: Catalogue production process

3.1 Major components of the catalogue production process

The processing of the catalogue observations was facilitated through a pipeline configuration similar to that used for the routine, production processing of observations but using a limited subset of pipeline control modules and SAS tasks associated with processing data from the EPIC cameras.

After creation of the pipeline data products (Sec. 3.2) a two-stage visual screening process was conducted (Sec. 3.3.1). The first stage identified 'clean' fields (=observations) suitable for catalogue inclusion and the second stage provided quality flagging for each individual source detected in each EPIC camera.

Source lists were produced for one exposure (see Sec. 2.2) from each EPIC camera and these were merged into an observation-level source list as part of the pipeline processing of each observation. These observation-level source lists were later merged into a single FITS-format catalogue file. The catalogue file was constructed to contain key columns from the original observation-level source lists plus additional columns derived from both final and intermediate pipeline data products. Some of these additional columns were included to provide observation-level meta-data for each source and others were included primarily for diagnostic purposes (Secs 4.3 and 4.4).

Throughout the documentation references to catalogue columns are marked in green. The prefix 'xx_' in a column name indicates a wildcard for any of the three EPIC cameras, i.e. with 'xx' replaced by 'PN,' 'M1' or 'M2'. Some of the column names also include an energy band identifier ('_N', where N=1,2,3,4,5,(6,7),8,9) which is typically not explicitly indicated.

3.2 Pipeline Processing

3.2.1 Relationship to routine processing

The selected observations were processed in a catalogue-specific version of the pipeline. A subset of the modules and tasks were configured to perform the EPIC event processing and filtering, source searching and rectification and catalogue cross correlation stages. Processing of data from the other XMM-Newton instruments was not carried out in the catalogue-specific version of the pipeline.

Processing in the routine pipeline normally occurs using a continuously improving set of calibration files (the so-called CCF, Calibration Constituents File) so that processing of any particular observation always uses the best calibration available at the time. In contrast, for reasons of uniformity the CCF used for the catalogue processing was frozen at the start of the catalogue production process. The individual CCFs which were used to calibrate the MOS and PN events may be derived by looking in the Calibration Index File (CIF) extension in each of the relevant event list files (see also appendix A.3).

Initially the catalogue observations were processed through to completion using a configuration-controlled version of the SAS and a reduced set of pipeline modules. The version numbers, which can be found stamped in the data products, were:

PPSVERS = '05000034/20020625.161738' / PPS configuration
SASVERS = 'xmmsas_20020522_1701' / SAS version

The general configuration of the pipeline is described on the SSC web page http://xmmssc-www.star.le.ac.uk/newpages/pipe_top_ext.html#config. The specific configuration details of the catalogue pipeline may be found in appendix A.4.

Following an initial appraisal of the source detection stage, two modifications were made to this configuration, and the source detection stage was rerun. These modifications were:

(i) The version of the maximum likelihood source detection SAS task emldetect v4.11.13 was replaced by emldetect v4.21.
(ii) A call to the background map spline fitting SAS task esplinemap which had the source cut out radius parameter (scut) set to the value 0.01 was replaced by a call to esplinemap with the parameter set to the value 0.001.

3.2.2 Processing steps - event calibration and filtering

The following sections describe the individual steps taken within the processing chain leading to images on which source detection could be performed.

a) The processing of a MOS exposure

1. A first pass constructs a flare lightcurve selecting events with the (XMMEA_22 = REJECT_BY_GATTI & XMMEA_EM = GOOD_MOS_EVENTS) flags set

2. GTIs are made by filtering the flare lightcurve using the MOS flare threshold. These GTIs are used to define the time regions in which bad pixel searching occurs.

3. All GTIs with a duration of less than 100 seconds are excluded.

4. The SAS task embadpixfind is used to locate dark pixels in each MOS CCD (using events which have not been filtered through the flare GTIs).

5. If no flare GTIs were made the SAS task embadpixfind is used to locate bright pixels

6. If flare GTIs do exist the events are filtered through the flare GTIs and then embadpixfind is used to locate bright pixels.

7. The SAS task badpix is run on each CCD event file in order to add a bad pixel extension.

8. The intervals in the global GTI file are aligned with the event list and merged with the CCD GTIs.

9. Attitude correction is applied to the individual events to convert raw CCD pixel coordinates, through camera coordinates, to celestial coordinates.

10. Raw event pulse height values are converted to rectified event energies.

11. Unwanted events are filtered out before lists are merged.

12. The per-CCD event lists are merged into one per camera.

13. Filter the good imaging events into final event lists.

14. Copy the CIF into a separate extension in the event list.

15. Make a second pass flare lightcurve selecting events with the (XMMEA_22 = REJECT_BY_GATTI & XMMEA_EM = GOOD_MOS_EVENTS) flags set and bad pixels excluded.

16. Create flare GTIs using the MOS flare threshold.

17. Filter the event files through GTIs into final event files.

b) The processing of a PN exposure

1. The SAS task badpixfind is run to create a mask of non-source pixels to be used in generating a background flare lightcurve

2. Badpixfind is run on each CCD to locate bright and dead pixels

3. Attitude correction is applied to the individual events to convert raw CCD pixel coordinates, through camera coordinates, to celestial coordinates.

4. Raw event pulse height values are converted to rectified event energies.

5. Filter events by selecting events with the (XMMEA_EP = PN_GOOD_EVENTS) flags set

6. Filter the CCD event files on the HK GTIs and merge into one

7. Copy the CIF into a separate extension in the event list.

8. Make a background flare lightcurve using events with energies between 7 keV and 15 keV, excluding bad pixels by using the previously created pixel mask.

9. Create flare GTIs using the PN flare threshold for use in later processing stages. 

c) Background filtering

The MOS flare lightcurves were produced from GATTI-flagged (essentially events with energies above 14 keV), single-pixel events from the outer CCDs. After binning the lightcurve, the flare GTIs were selected by imposing a rate threshold of 2 cnts/arcmin^2/ksec.

The PN flare lightcurves were produced in the 7.0 - 15 keV energy range. After binning the lightcurve, the flare GTIs were selected by imposing a rate threshold of 10 cnts/arcmin^2/ksec.
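
As a rough illustration of how a rate threshold turns a binned flare lightcurve into GTIs, the sketch below keeps contiguous runs of bins at or below the threshold and then drops intervals shorter than 100 seconds, as described in the processing steps. The function name, the contiguous-bin input layout and the inclusive treatment of the threshold are assumptions made for this example; the actual SAS tasks are not reproduced here.

```python
MOS_THRESHOLD = 2.0   # cnts/arcmin^2/ksec, from the text
PN_THRESHOLD = 10.0   # cnts/arcmin^2/ksec, from the text


def flare_gtis(bin_starts, rates, bin_width, threshold, min_duration=100.0):
    """Return (start, stop) intervals in which the rate stays at or below threshold.

    bin_starts: start times of contiguous lightcurve bins (seconds)
    rates:      background rate in each bin, same units as threshold
    """
    gtis = []
    start = None
    for t, rate in zip(bin_starts, rates):
        if rate <= threshold:
            if start is None:
                start = t            # open a new good interval
        elif start is not None:
            gtis.append((start, t))  # close the interval at the flaring bin
            start = None
    if start is not None:
        gtis.append((start, bin_starts[-1] + bin_width))
    # Intervals shorter than min_duration are excluded.
    return [(a, b) for a, b in gtis if b - a >= min_duration]
```

With 50-second bins, a single quiet bin between two flaring bins yields a 50-second interval, which is discarded by the 100-second minimum.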

d) Image creation

1. Exclude from the flare GTIs all intervals with duration less than 100 seconds.

2. For each energy band make a counts image. The images are 600 x 600 pixels with 4-arcsecond pixel sides. The images are tangent plane projections of celestial coordinates. The definitions of all the energy bands used are given in Tab. 3.1 below.


Table 3.1: Energy bands used in XMM EPIC observations

Basic energy bands:
  1 = 0.2 -  0.5 keV
  2 = 0.5 -  2.0 keV
  3 = 2.0 -  4.5 keV
  4 = 4.5 -  7.5 keV
  5 = 7.5 - 12.0 keV
Broad energy bands:
  6 = 0.2 -  2.0 keV   = soft band, no images made
  7 = 2.0 - 12.0 keV   = hard band, no images made
  8 = 0.2 - 12.0 keV   = total band
  9 = 0.5 -  4.5 keV   = XID band

3. Event selection for PN images is PATTERN <= 4 and RAWY > 12 with events ON_OFFSET_COLUMN excluded. Band 1 images have the additional stricter requirement PATTERN = 0, while band 8 images have PATTERN = 0 below 0.5 keV.

4. For MOS band 1 - 5 images no PATTERN selection is made beyond the 0 - 25 selection made in the event lists.

5. Make exposure images corresponding to bands 1 - 5 count images
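
The band definitions of Tab. 3.1 and the image geometry can be written down compactly. The sketch below is illustrative only: the function names are invented for this example, and the treatment of band boundaries (lower edge inclusive, upper edge exclusive) is an assumption, since the text does not state it.

```python
# Basic energy bands from Tab. 3.1, in keV.
BASIC_BANDS = {
    1: (0.2, 0.5),
    2: (0.5, 2.0),
    3: (2.0, 4.5),
    4: (4.5, 7.5),
    5: (7.5, 12.0),
}
# Broad bands expressed as (first, last) basic band they span.
BROAD_BANDS = {6: (1, 2), 7: (3, 5), 8: (1, 5), 9: (2, 3)}

IMAGE_SIZE_PIX = 600
PIXEL_ARCSEC = 4.0
# Width of an image on the sky: 600 pixels of 4 arcsec each.
IMAGE_WIDTH_ARCMIN = IMAGE_SIZE_PIX * PIXEL_ARCSEC / 60.0


def basic_band(energy_kev):
    """Return the basic band (1-5) containing the energy, or None if outside."""
    for band, (low, high) in BASIC_BANDS.items():
        if low <= energy_kev < high:   # boundary convention is an assumption
            return band
    return None


def band_limits(band):
    """Return (low, high) energy limits in keV for any band 1-9."""
    if band in BASIC_BANDS:
        return BASIC_BANDS[band]
    first, last = BROAD_BANDS[band]
    return BASIC_BANDS[first][0], BASIC_BANDS[last][1]
```

The image width follows directly from the numbers in the text: 600 x 4 arcsec = 2400 arcsec = 40 arcmin.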

3.2.3 Source Detection

For the production of the catalogue, source detection was performed separately on the data of the three EPIC cameras. The sources detected in each of the three cameras were then merged into a common source list. For each camera, source detection was performed simultaneously on the 5 images in the energy bands 1 - 5. In addition, source parameters for the XID band were determined simultaneously in the energy bands 2 and 3 (cf. Tab. 3.1). The total source parameters (that is, for band 8) were combined from the results in bands 1 - 5. Note that the catalogue source detection and parametrization technique is optimized for point-like sources, hence sources that are significantly extended will have underestimated fluxes. The detection pipeline consists of the following processing steps.

a) Exposure map calculation

The SAS task eexpmap was used to calculate exposure maps for each image. The exposure is determined from good time intervals (GTIs) and is corrected for vignetting, quantum efficiency, and bad pixels or columns. For the projection from detector to sky coordinates the attitude history file is used.

b) Creation of detection masks

The detection masks define the areas of the images which are suitable for source detection. For the catalogue processing, all image pixels, where the exposure was at least 5% of the on-axis value, were included in the detection masks.
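
The 5% criterion can be sketched as follows. This is a minimal illustration, not the SAS implementation: the function name is hypothetical, the exposure map is represented as a plain list of rows, and the on-axis exposure is passed in explicitly because the text does not say how it is obtained.

```python
def detection_mask(exposure_map, on_axis_exposure, min_fraction=0.05):
    """Return a boolean mask that is True where a pixel may be searched.

    exposure_map:      2-D list of exposure values (seconds), one list per row
    on_axis_exposure:  exposure at the optical axis (seconds)
    min_fraction:      minimum exposure relative to on-axis (5% in the text)
    """
    limit = min_fraction * on_axis_exposure
    return [[value >= limit for value in row] for row in exposure_map]
```

For an on-axis exposure of 1000 s, pixels with less than 50 s of exposure (for example in chip gaps or far off-axis) are excluded.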

c) Local source detection

The SAS task eboxdetect was used to create a preliminary source list. At this stage eboxdetect performs a sliding box search using a local background determined in a frame around the search box. The box size was set to 5 x 5 pixels and the likelihood threshold was 8.0. The resulting source lists were only used as input for the subsequent background determination.

d) Creation of background maps

The local box detection list is used by the SAS task esplinemap to blank out the positions of detected X-ray sources in the input image of each energy band. The resulting image was then used as input for a spline fit to calculate a smoothed background map for the entire image. The background maps for the catalogue were calculated using a 16 x 16 nodes spline fit.

e) Map source detection

A second pass of eboxdetect creates a sliding-box source list, this time using the background maps generated by esplinemap. As in step c), the box size was set to 5 x 5 pixels, and the detection likelihood threshold was set to 8.0 for the catalogue pipeline.

f) Parameter estimation by PSF fitting

The sources detected in step e) are passed on to the SAS task emldetect. This task determines the source parameters by fitting the instrumental point spread function (PSF) simultaneously to the subimages created for each input source position in the five standard energy bands. The size of the circular subimages (xx_CUTRAD) was chosen such that in the hardest band (band 5) 68% of the flux of a point source is contained within that image extraction radius (xx_EEF, see Cols 245, 251 and 257). The source position was fitted simultaneously on all input images, whereas the source count rates were left to vary independently in each image. No source extent was fitted, i.e. all sources were treated as point sources.

Emldetect uses the exposure maps from step a) to correct the count rates for vignetting and losses due to inter-chip gaps and bad pixels/columns. All EPIC PN count rates were also corrected for losses due to events arriving during readout times (out of time events), that is,

count_rate = source_counts  / (exp_map * oot_factor)

with the oot_factor = 0.9411 for PN PrimeFullFrame modes, oot_factor = 0.97815 for PN ExtendedFullFrame modes and oot_factor = 1.0 for all other PN and M1/M2 modes (xx_SUBMODE).
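
A minimal sketch of this correction is given below. The function name and argument layout are assumptions made for illustration; the mode labels are copied from the sentence above and may not match the exact strings stored in xx_SUBMODE.

```python
# Out-of-time correction factors quoted in the text (PN only).
PN_OOT_FACTORS = {
    "PrimeFullFrame": 0.9411,
    "ExtendedFullFrame": 0.97815,
}


def count_rate(source_counts, exp_map, camera, submode):
    """Count rate corrected for out-of-time events.

    source_counts: fitted source counts
    exp_map:       exposure map value at the source position (seconds)
    camera:        "PN", "M1" or "M2"
    submode:       observing mode label (see note above on naming)
    """
    if camera == "PN":
        oot_factor = PN_OOT_FACTORS.get(submode, 1.0)
    else:
        oot_factor = 1.0   # no correction for the MOS cameras
    return source_counts / (exp_map * oot_factor)
```

For the same counts and exposure, a PN full-frame source therefore receives a count rate about 6% higher than it would without the correction (1/0.9411 is roughly 1.063).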

Source fluxes were calculated from count rates based on energy conversion factors assuming a spectral model of an absorbed power-law with

N_H = 3.0 x 10^20 cm^-2

and slope = 1.7  (see CAL-TN-0023-v2.0.ps)

Note that all count rates and fluxes correspond to the flux in the entire PSF and do not need any further corrections for PSF losses.

The detection likelihood values (xx_DET_ML) are based on the likelihood ratio described by Cash (1979). Depending on the number of input images, corrections to the detection likelihoods were applied, which take the number of free fit parameters into account. The output DET_ML values approximately obey the relation ML = -ln(P), where P is the probability to find a spurious source with the likelihood value ML at a given position.
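
The relation ML = -ln(P) can be inverted to give a feel for what a likelihood value means. The helper names below are invented for this example, and since the text states that the relation holds only approximately, the results should be read as indicative.

```python
import math


def spurious_probability(det_ml):
    """Approximate probability P of a spurious source for likelihood ML."""
    return math.exp(-det_ml)


def det_ml_from_probability(probability):
    """Likelihood ML corresponding to a spurious-source probability P."""
    return -math.log(probability)
```

A likelihood of 8 corresponds to P of about 3.4 x 10^-4 per trial position, and a likelihood of 10 to about 4.5 x 10^-5.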

The resulting parameters of the 5 individual bands were combined into band 8 (total) values. The XID parameters are the combined results obtained separately from a simultaneous fit in the bands 2 and 3 where the positions are identical with the ones derived for the bands 1 - 5.

Hardness ratios of the form HR = (CR2 - CR1) / (CR1 + CR2) were calculated for the combinations of energy bands 1 & 2 (xx_HR1), 2 & 3 (xx_HR2), and 3 & 4 (xx_HR3), with CR being the count rates. When comparing hardness ratios, in particular HR1, the filters used in the individual observations have to be considered. Note that a large fraction of the hardness ratios are calculated from marginal or non-detections in at least one of the energy bands. Individual hardness ratios should only be deemed reliable if the source was above the detection likelihood threshold in both energy bands; if the source was detected in only one of the bands, they have to be treated as limits.
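
The three hardness ratios can be computed from the band count rates as sketched below. The function names and the dictionary layout are illustrative assumptions; returning None for a zero total rate is a choice made for this example and is not taken from the pipeline.

```python
def hardness_ratio(cr_soft, cr_hard):
    """HR = (CR_hard - CR_soft) / (CR_soft + CR_hard)."""
    total = cr_soft + cr_hard
    if total <= 0.0:
        return None   # undefined when no counts are seen in either band
    return (cr_hard - cr_soft) / total


def hardness_ratios(count_rates):
    """count_rates maps basic band number (1-4) to count rate."""
    return {
        "HR1": hardness_ratio(count_rates[1], count_rates[2]),
        "HR2": hardness_ratio(count_rates[2], count_rates[3]),
        "HR3": hardness_ratio(count_rates[3], count_rates[4]),
    }
```

By construction each ratio lies between -1 (all counts in the softer band) and +1 (all counts in the harder band).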

References:

Cash, W., 1979, Parameter estimation in astronomy through application of the likelihood ratio, ApJ, 228, 939

3.2.4 Merging of EPIC source lists

The emldetect output source lists from the individual EPIC cameras are merged into a common list by the SAS task srcmatch. One row per detection (in any camera) is written to the output table, while detections that have coinciding positions in the different camera lists are merged. In the catalogue pipeline, two detections were merged if

DIST < 4 * SQRT(radec_err(1)^2 + radec_err(2)^2 + systerr^2)

where DIST is the distance between the two detections, radec_err(i) is the statistical position error of the individual camera-specific detection i (output from emldetect), and systerr is 4 arcsec.
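
The merging condition can be expressed as a short test. The function name is hypothetical and all quantities are assumed to be in arcseconds; the sketch shows only the criterion, not the bookkeeping performed by srcmatch.

```python
import math

SYSTERR_ARCSEC = 4.0   # systematic term quoted in the text


def should_merge(dist, radec_err1, radec_err2, systerr=SYSTERR_ARCSEC):
    """True if two camera-specific detections satisfy the merging criterion.

    dist:        separation of the two detections (arcsec)
    radec_err1:  statistical position error of detection 1 (arcsec)
    radec_err2:  statistical position error of detection 2 (arcsec)
    """
    limit = 4.0 * math.sqrt(radec_err1 ** 2 + radec_err2 ** 2 + systerr ** 2)
    return dist < limit
```

With statistical errors of 1 arcsec each, the limit is 4 * SQRT(18), about 17 arcsec, so the systematic term dominates for well-localized detections.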

Srcmatch determines total EPIC count rates and global detection likelihoods (EP_DET_ML) for the merged sources by adding the respective values from the individual instruments. Average values for fluxes, hardness ratios and source positions are calculated as well.

3.2.5 Position rectification

The SAS task eposcorr (v3.4.2) correlates the X-ray positions from an observation (output of srcmatch) with optical positions and minimizes the positional offsets by applying a translation and rotation to the X-ray positions. For the catalogue pipeline the merged source list from each observation was correlated with the USNO A2.0 catalogue, and a search radius of 15 arcsec was applied.

The SAS task evalcorr evaluates the quality of the position rectification of eposcorr. The results from a large sample of fields were used to establish:

(i) the threshold for accepting the refined astrometric solution:

POSCOROK is set to True if  r < 6 arcsec AND EPOSCORR_LIKH > 5, with

r = SQRT( raoffset^2 + decoffset^2 )

where raoffset and decoffset are the mean shift in RA and Dec, respectively, determined by eposcorr for a given observation. If POSCOROK is set to True the columns RA_CORR and DEC_CORR give the corrected X-ray positions calculated by eposcorr. If the refined astrometric solution was not accepted the flag POSCOROK was set to False and the columns RA_CORR and DEC_CORR are the uncorrected values RA and Dec.

(ii) the estimates of the systematic error on the XMM-Newton astrometric frame:

The intrinsic systematic 1-sigma error for XMM-Newton fields (i.e., before any correction is applied) was estimated from the width of the distributions of position shifts found in eposcorr runs producing a value SYSERRCC = 1.5 arcsec which is assigned to detections in all fields for which an acceptable astrometric correction using eposcorr was not found (that is, POSCOROK is set to False).

The residual systematic 1-sigma error for XMM-Newton fields for which acceptable astrometric correction using eposcorr was possible is set to SYSERRCC = 0.5 arcsec (that is, POSCOROK is set to True); this value was estimated from a comparison of the width of the distribution of corrected XMM positions with optical catalogue objects to expectations based on the statistical errors alone.
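
The acceptance rule and its consequences for the catalogue columns can be summarized in a few lines. The function names are invented for this example; the thresholds and the SYSERRCC values are those quoted above, with offsets in arcseconds.

```python
import math


def poscorok(raoffset, decoffset, eposcorr_likh):
    """True if the refined astrometric solution is accepted."""
    r = math.hypot(raoffset, decoffset)   # SQRT(raoffset^2 + decoffset^2)
    return r < 6.0 and eposcorr_likh > 5.0


def syserrcc(ok):
    """Systematic 1-sigma position error (arcsec) assigned to a field."""
    return 0.5 if ok else 1.5


def catalogue_position(ra, dec, ra_eposcorr, dec_eposcorr, ok):
    """Values written to RA_CORR and DEC_CORR for a detection."""
    return (ra_eposcorr, dec_eposcorr) if ok else (ra, dec)
```

A field with mean shifts of 3 and 4 arcsec (r = 5 arcsec) and a likelihood of 10 is accepted and receives SYSERRCC = 0.5 arcsec; the same shifts with a likelihood of 4 are rejected, the uncorrected positions are kept and SYSERRCC = 1.5 arcsec.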

3.2.6 Cross-correlation with astronomical catalogues

Each EPIC source position has been cross-correlated with a large number of astronomical catalogues (109 catalogues and 80 tables extracted from articles) covering the electromagnetic spectrum from radio to hard X-rays (Tab. 3.2). The SIMBAD and VizieR databases at the Centre de Données astronomiques de Strasbourg (CDS) and the NASA/IPAC Extragalactic Database (NED) were used for that purpose. The criterion for considering a catalogue entry as being coincident in position with an X-ray source is that the distance between them lies inside the 99.97 % confidence interval (3 Gaussian sigma on the angular separation) given their respective positional uncertainties. Error ellipses are explicitly taken into account in the computation.
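
The idea behind the coincidence criterion can be illustrated with a deliberately simplified sketch. It replaces the error ellipses used in the actual cross-correlation by circular 1-sigma errors combined in quadrature, so it is not equivalent to the procedure described above; the function names are hypothetical.

```python
import math


def separation_arcsec(ra1, dec1, ra2, dec2):
    """Angular separation (arcsec) of two positions given in degrees."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    # Haversine formula, numerically stable for small separations.
    h = (math.sin((dec2 - dec1) / 2.0) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2.0) ** 2)
    return math.degrees(2.0 * math.asin(min(1.0, math.sqrt(h)))) * 3600.0


def is_match(separation, sigma_xray, sigma_cat, n_sigma=3.0):
    """Simplified test: separation within n_sigma combined circular errors.

    All arguments are in arcsec; circular errors are an assumption made here.
    """
    return separation <= n_sigma * math.hypot(sigma_xray, sigma_cat)
```

With a 1 arcsec X-ray error and a 0.5 arcsec catalogue error, this simplified test accepts counterparts out to about 3.4 arcsec.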

The catalogue itself provides a summary extract of the full catalogue cross-correlations, listing up to nine of the closest matches found. The full catalogue cross-correlation results are provided in the catalogue products (see appendix A.1) and are accessible from the on-line catalogue databases XSA, XCAT-DB, and LEDAS.

In the summary columns for each cross-correlation entry we give the astronomical catalogue name (CAT_NAME_N, with N denoting the number of the cross-correlation match), name of the catalogued source (CAT_ENTRY_N), its position (CAT_RA_N, CAT_DEC_N, CAT_RADEC_ERR_N) and distance from the XMM source position (D_EPIC_CAT_N), and one measurement (CAT_VAL_N) along with the type of the measurement as stated in the respective catalogue (CAT_MEAS_N).

For convenience, the nearest catalogue-specific cross-correlation match for each of the six selected databases (ROSAT, USNO, GSC, 2MASS, APM, SIMBAD/NED) is also given (see Sec. 4.4.6).

3.3 Additional Processing

3.3.1 Visual screening

Visual screening of the processed data was performed in two stages using the Data Product Screening Subsystem (DPSS). The DPSS provides GUI driven visualization and interactive flagging and reporting facilities.

Stage 1 screening was performed as an overall quality check of an observation and as an assessment of the suitability of the data products for stage 2 screening. Stage 1 screening consisted of the examination of one total band image from each EPIC camera which had been selected by the pipeline for source searching.

Stage 2 screening consisted of inspecting individual sources in the total band images and assessing the suitability of each source for potential inclusion in the final catalogue. As a result of stage 2 screening each detected source had a number of quality flags associated with it.

Stage 2 screening

The 1XMM source catalogue is compiled from the EPIC emldetect source lists that are produced by pipeline processing of selected observations (see Sec. 3.2.3). Whilst the detection procedure has good reliability, a number of spurious detections remain in the emldetect source lists, largely due to image defects, source peculiarities or features of the detection software. There may also be obvious sources that have not been detected.

The overlay of the source list over the image for each accepted exposure was visually inspected. The aim was:

(i) to identify probable spurious detections which are due to known problems (cf. Tab. 3.3), 
(ii) to identify sources which are considered to be real but are detected in an environment where spurious detections are known to occur (cf. item(i)), and
(iii) to identify sources where the parameters (notably position and count rate) could be affected, e.g., by a position close to a defect, a CCD edge or edge of the field of view.

The default position of every flag is F for False. When a flag was set it means it has been changed to T for True. There are 12 flag positions in total of which the first 7 have been assigned the meanings given in Tab. 3.4:


Table 3.4: Flag Keys
1 false detection
2 no visual inspection
3 the source lies within an extended region
4 there are nearby sources
5 the source position is suspect
6 the source is near an edge (of a CCD and/or the field-of-view)
7 a source comment is given

In this catalogue only the flags 1, 3, 5, 6 and 7 were set explicitly. Flag 2 was discarded after the initial experiences with the screening, and flag 4 is covered by the source comments given. Flag 6 has been set either (i) if the field of view was sufficiently discernible to be traced by eye, or (ii) if the source detection and/or its count rate was obviously affected by its position close to a CCD edge or the edge of the field of view.
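
A flag set of this kind can be decoded as sketched below. The representation as a 12-character string of 'T' and 'F' is an assumption made for this illustration (the storage format is described with the catalogue columns), and the function name is hypothetical; the meanings are those of Tab. 3.4.

```python
FLAG_MEANINGS = {
    1: "false detection",
    2: "no visual inspection",
    3: "the source lies within an extended region",
    4: "there are nearby sources",
    5: "the source position is suspect",
    6: "the source is near an edge (of a CCD and/or the field-of-view)",
    7: "a source comment is given",
}
N_FLAGS = 12


def decode_flags(flag_string):
    """Return the meanings of all flags set to 'T' in a 12-character string."""
    if len(flag_string) != N_FLAGS or set(flag_string) - {"T", "F"}:
        raise ValueError("expected 12 characters, each 'T' or 'F'")
    return [
        FLAG_MEANINGS.get(position, "unassigned")
        for position, value in enumerate(flag_string, start=1)
        if value == "T"
    ]
```

For example, a string with 'T' in positions 3 and 6 describes a source inside an extended region and close to an edge.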

The source comment is a short explanation of the reason why one or more flags have been set. Table 3.3 gives an overview of the comments given and the flags set, together with a more detailed explanation for it.

Hot pixels (counts > ~12) that were not excluded from the source detection process (cf. exposure maps, Sec. 3.2.3 a)) and triggered a detection have been flagged; warm pixels with about 5 - 10 counts have been flagged on PN exposures only. MOS-2 (and, to a much lesser extent, MOS-1) is known to exhibit various warm pixels which can occasionally trigger a source detection (see Sec. 6.1). They have not been flagged by hand since they are visually obvious only on exposures with low exposure time.

The source detection algorithm occasionally fails to pick up sources, most of which are close to a neighbouring brighter source. This is particularly obvious in very crowded fields. A list of such sources, together with their observation ID, exposure ID and estimated RA and DEC (J2000), is given in Tab. 3.5. A source comment is also given where applicable.

During screening, it became obvious that for some fields it was difficult to distinguish between obvious spurious sources and real sources (i.e., fields with diffuse and very bright extended emission; fields with extended clumpy emission, e.g., WR25; dense fields with an overlaid extended emission, e.g., M31 core). We have excluded the more problematic observations from the catalogue since false-detection flagging could not reliably be applied here.

3.3.2 Thumbnails

Thumbnail images have been made of every source in the catalogue. A maximum of nine thumbnails is available per source: thumbnail images were made in the bands 6, 7, and 8 (0.2 - 2.0 keV, 2.0 - 12.0 keV, 0.2 - 12.0 keV, see Tab. 3.1) and for each of the available EPIC cameras from the set MOS-1, MOS-2 and PN. The thumbnails are stored as PNG files.

Each thumbnail image is 4 arcmin2. The images are not smoothed. The image data are taken from the FITS images which form part of the catalogue product set, and thus embody the same X-ray event selections as these images. The brightness scaling of the thumbnail images is linear, but pixel brightness is truncated at a given saturation value. The 'heat' colour map is used, and the images are scaled so that the pixel range from 0 to the saturation limit spans the colour map. The saturation value is calculated for optimum display of the source at the centre of the field.

Green cross-hairs are overlaid on the centre of the image. These span 1.92 arcmin, with a 0.8-arcminute-wide hole in the middle.

The legend at the head of each image gives the following:

4. Catalogue description

4.1 Catalogue creation

The catalogue has been created by merging the outputs from the pipeline processing (Sec. 3.2), the visual screening (Sec. 3.3.1) and additional columns which are explained in Sec. 4.2.2.

The catalogue entries for each individual field have been created by running the SAS task makecat on the individual camera source lists, the merged EPIC source list and the catalogue cross-correlation summary file for this field. This task simply combines information about a given detection from the various input files, and appends it to a list which ultimately contains a row for each detection. Finally, the columns described in Sec. 4.3 have been modified and extra columns which are described in Sec. 4.4 have been added.

4.2 Column overview

In this section, the layout of the catalogue is described and an overview of every column is given. For each observation there are up to three exposures, that is, one per camera, and each exposure is divided up into several energy bands (Tab. 3.1). Consequently, the data can be organized on different levels: on the observation level there are the final mean parameters of the source; on the exposure or camera level the data for each of the up to three exposures/cameras are given, in the order PN, M1, and M2; and on the band level we give the energy-dependent details of the source parameters. The description of the table columns reflects this.

Each column in the catalogue is described in the links given below. The column name is given in capital letters, the FITS data format in brackets and the unit in square brackets. If the column originates from an SAS task, the name of the task is given to the right hand side and a link is set to the online SAS package documentation on http://xmm.vilspa.esa.es/sas/current/doc/packages.All.html. A description of the column and possible cross-references follow. The columns in the catalogue FITS table are in arbitrary order and can be identified via their name.

Entries with NULL are given

Details of the columns

Part 1: Cols 1 - 7: Identification of the source
Part 2: Cols 8 - 17: Details of the observation and the processing
Part 3: Cols 18 - 32: Details of the exposures
The details for each exposure such as filter, observing mode and exposure times are given.
Part 4: Cols 33 - 72: Coordinates
The mean external and internal equatorial and Galactic coordinates are followed by these coordinates for each camera, the detector coordinates and the CCD chip coordinates.
Part 5: Cols 73 - 99: Flags
This part lists the flags to qualify the source. The summary flag, which gives an overall assessment for the source, is followed by particular flags and comments for each detection. A distance to the nearest neighbouring detection in the same exposure is also given.
Part 6: Cols 100 - 153: Source parameters
The parameters of the source on the observation level (first three columns) and the exposure level are given here: fluxes, the detection likelihood, hardness ratios, count rates, the source counts, and other parameters from the source detection process.
Part 7: Cols 154 - 258: Energy-dependent source parameters
This part lists the source parameters for each band separately, since these bands formed the inputs to the source detection process.
Part 8: Cols 259 - 379: Cross-correlation matches with other catalogues
The details of the cross-correlations with astronomical catalogues by two different approaches are given here: first, the nearest match in each of 6 major catalogues (ROSAT, SIMBAD/NED, USNO, GSC, 2MASS and APM) is given; second, up to nine nearest cross-correlation matches with a compilation of various astronomical catalogues including the above mentioned catalogues (see Sec. 4.4.6 for details) are listed, sorted by increasing distance to the XMM source. The total number of these cross-correlation matches is given at the beginning of the second section. As a consequence, there can be some duplications in the catalogue entries between the two approaches.

4.3 Modifications to the pipeline output

4.3.1 Corrections to fluxes

The total-band flux in each camera (xx_FLUX) and its error (xx_FLUX_ERR) have been recalculated during the construction of the catalogue, because the original values were biased by large systematic errors in the high-energy bands. They have been recomputed from the total count rates (xx_TOT) and their errors (xx_TOT_ERR) using the formula:

Flux = Count_Rate * ECF

where the ECF is an energy conversion factor calculated using the most recent detector matrices and assuming the spectral model given in Sec. 3.2.3 f).

The derived ECFs, which convert the total-band count rate in [cnts/s] to flux in units of [10^-11 erg/s/cm^2], are given below for each camera and filter:

Camera   Thin     Medium   Thick
MOS-1    1.0692   1.1018   1.3028
MOS-2    1.0650   1.0973   1.2966
PN       0.3112   0.3233   0.3931

The EPIC-PN Thin1 and Thin2 filters have been assumed to have the same transmission.
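
As an illustration of the conversion above, the following minimal Python sketch applies the tabulated ECFs to a total-band count rate. The function name and the dictionary layout are illustrative only and are not part of the catalogue or of the SAS. The error is scaled by the same factor as the rate, which assumes that the ECF itself carries no uncertainty.

```python
# ECFs from the table above: flux in units of 1e-11 erg/s/cm^2 per count/s.
ECF = {
    "MOS-1": {"Thin": 1.0692, "Medium": 1.1018, "Thick": 1.3028},
    "MOS-2": {"Thin": 1.0650, "Medium": 1.0973, "Thick": 1.2966},
    "PN": {"Thin": 0.3112, "Medium": 0.3233, "Thick": 0.3931},
}


def total_band_flux(camera, filter_name, rate, rate_err):
    """Return (flux, flux_err) in erg/s/cm^2 for a total-band rate in counts/s.

    Hypothetical helper: the name and signature are not from the catalogue.
    """
    # Thin1 and Thin2 are assumed to have the same transmission (see text).
    key = "Thin" if filter_name in ("Thin1", "Thin2") else filter_name
    factor = ECF[camera][key] * 1.0e-11
    # Assumption: the ECF is treated as exact, so the error scales linearly.
    return rate * factor, rate_err * factor


flux, flux_err = total_band_flux("PN", "Thin1", 0.01, 0.002)
print(flux, flux_err)
```

For a PN count rate of 0.01 counts/s through a Thin filter this gives a flux of about 3.1*10^-14 erg/s/cm^2.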

4.3.2 Mean EPIC Flux

Instead of the averaged mean fluxes from the SAS task srcmatch, the catalogue gives the weighted mean of the individual camera total-band fluxes (EP_FLUX) and the error on the weighted mean (EP_FLUX_ERR), where:

EP_FLUX_ERR = SQRT( 1.0 / SUM( 1 / FLUX_ERR^2 ) )
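
The error formula above is the one that belongs to an inverse-variance weighted mean. The sketch below assumes that EP_FLUX was computed with weights 1 / FLUX_ERR^2; the text states only the error formula, so the weighting of the mean itself is an assumption, and the function name is illustrative.

```python
import math


def weighted_mean_flux(fluxes, errors):
    """Inverse-variance weighted mean of the camera fluxes and its error.

    Assumption: weights are 1 / error^2, consistent with the EP_FLUX_ERR formula.
    """
    weights = [1.0 / (err * err) for err in errors]
    weight_sum = sum(weights)
    mean = sum(w * f for w, f in zip(weights, fluxes)) / weight_sum
    mean_err = math.sqrt(1.0 / weight_sum)
    return mean, mean_err


# Two cameras with fluxes and errors in erg/s/cm^2 (made-up values).
ep_flux, ep_flux_err = weighted_mean_flux([1.0e-14, 1.2e-14], [1.0e-15, 2.0e-15])
print(ep_flux, ep_flux_err)
```

The camera with the smaller error dominates the mean, and the combined error is always smaller than the smallest individual error.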

4.3.3 Corrections to detection likelihood values

The significance of each detection is given as a figure of merit detection likelihood (xx_DET_ML) for each band, camera and source. A problem with these values was discovered late on in the production of the catalogue giving rise to the following warning (which was issued in XMM-News):

"A coding error has recently been discovered in the SAS task emldetect which performs EPIC source parameterization. Due to this error, the detection likelihood values reported by emldetect (i.e. in the column DET_ML in the output files) are overestimated by a factor 2 or more, depending on the number of input images. Other aspects of the performance of emldetect are unaffected.

As emldetect is used in the standard XMM-Newton processing pipeline, this problem affects EPIC source lists produced with SAS 5.4.1 or earlier. Users are advised to treat with caution any detections in existing source lists with low DET_ML values (especially DET_ML <~ 20) as the number of spurious sources per EPIC image becomes significant at these values.

This error will be corrected in the next version of the SAS."

The xx_DET_ML values given in the catalogue have been corrected for this problem. None of the associated data products (i.e., the source lists) have been corrected.

4.4 Additional columns

4.4.1 Additional source-designation columns

Each entry in the catalogue (i.e., each detection) has been assigned a unique running number, SRCINDEX, for ease of identification. Independently a name, XMMSRCNAME, has been assigned to each detection based upon the IAU registered classification 1XMM. The form of these names is "1XMM Jhhmmss.sSddmmss" where hhmmss.s is taken from the eposcorr corrected right ascension coordinate given in the column RA_CORR and Sddmmss is the sign and eposcorr corrected declination taken from the column DEC_CORR. Note that it is implicit in the IAU naming scheme (which is a naming scheme for detections) that there is no guarantee that two detections of the same source in different observations will have the same name nor, at least in principle, that two detections which have the same name are indeed different observations of the same source.
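
The name format can be illustrated with the short Python sketch below, which builds a designation from coordinates in decimal degrees. It assumes that the coordinate fields are truncated rather than rounded, as is usual for IAU-style designations; the text above does not state this explicitly. The function name is illustrative.

```python
import math


def source_name(ra_deg, dec_deg):
    """Build a '1XMM Jhhmmss.sSddmmss' name from RA_CORR and DEC_CORR in degrees.

    Assumption: fields are truncated, not rounded.
    """
    # Work in integer tenths of a second of time; the small offset guards
    # against floating-point values that fall just below an exact tenth.
    tenths = int(math.floor(ra_deg * 2400.0 + 1e-9))
    hh, rest = divmod(tenths, 36000)
    mm, rest = divmod(rest, 600)
    ss, tenth = divmod(rest, 10)

    sign = "+" if dec_deg >= 0 else "-"
    arcsec = int(math.floor(abs(dec_deg) * 3600.0 + 1e-9))
    dd, rest = divmod(arcsec, 3600)
    dm, ds = divmod(rest, 60)

    return "1XMM J%02d%02d%02d.%d%s%02d%02d%02d" % (hh, mm, ss, tenth, sign, dd, dm, ds)


name = source_name(10.684708, 41.26875)
print(name)
```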

Unique source number

Many fields in the catalogue have either been observed more than once or overlap with other (independent) observations, which means that some of the sources have been observed more than once. To find such sources and identify them as the same source, indicated by UNIQUE_SRCNUM, we determined the distance between each pair of sources and compared it with four times the combined positional errors of the two sources:

D = 4 * SQRT( (RADEC_ERR(1)^2 + SYSERRCC(1)^2) + (RADEC_ERR(2)^2 + SYSERRCC(2)^2) )

with RADEC_ERR(i) and SYSERRCC(i) (see catalogue table) being the statistical error and systematic error, respectively, of the source. To avoid a number of unlikely matches due to very large positional errors of a few of the sources we have restricted the maximum of the statistical error of each source to 5 arcsec. This number seems practical since there are only 79 sources in the catalogue (0.14% of the sources) that have a statistical error larger than 5 arcsec.

Due to a problem with the statistical position error, RADEC_ERR (see Sec. 6.3), a minimum statistical error of 1.5 arcsec was assumed if the sum of the source counts in the exposures where the source was detected was less than 100 counts. All statistical errors larger than 1.5 arcsec were kept unchanged.

If the distance between the two sources is smaller than D, then the two sources were allocated a common unique number. Any further source that was found within the search radius of one of the sources already in such a 'cluster' was then added to it and allocated the same unique number. This way we avoided allocating more than one matching number to a source (e.g., if a source was found lying between two others for which the search radii do not overlap, all three sources were identified as being the same).
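
The matching procedure described above amounts to a 'friends-of-friends' grouping. The following Python sketch is one possible way to reproduce it and is not the code used for the catalogue: the dictionary keys, the union-find bookkeeping and the haversine separation are choices made for this illustration. Errors are in arcsec and positions in degrees.

```python
import math


def effective_stat_err(radec_err, total_counts):
    """Apply the 5-arcsec cap and the 1.5-arcsec floor for faint detections."""
    err = min(radec_err, 5.0)
    if total_counts < 100:
        err = max(err, 1.5)
    return err


def match_radius(det_a, det_b):
    """D = 4 * sqrt(sum of squared statistical and systematic errors) [arcsec]."""
    total = 0.0
    for det in (det_a, det_b):
        stat = effective_stat_err(det["radec_err"], det["counts"])
        total += stat ** 2 + det["syserrcc"] ** 2
    return 4.0 * math.sqrt(total)


def separation_arcsec(det_a, det_b):
    """Great-circle separation from the haversine formula."""
    ra1, dec1 = math.radians(det_a["ra"]), math.radians(det_a["dec"])
    ra2, dec2 = math.radians(det_b["ra"]), math.radians(det_b["dec"])
    h = (math.sin((dec2 - dec1) / 2.0) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2.0) ** 2)
    return math.degrees(2.0 * math.asin(math.sqrt(h))) * 3600.0


def assign_unique_numbers(detections):
    """Give every chain of mutually linked detections one common number."""
    parent = list(range(len(detections)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    for i in range(len(detections)):
        for j in range(i + 1, len(detections)):
            a, b = detections[i], detections[j]
            if separation_arcsec(a, b) < match_radius(a, b):
                parent[find(i)] = find(j)  # merge the two clusters

    labels = {}
    return [labels.setdefault(find(i), len(labels) + 1)
            for i in range(len(detections))]


def make_det(ra, dec):
    return {"ra": ra, "dec": dec, "radec_err": 0.5, "syserrcc": 0.5, "counts": 500}


# Three detections in a chain 3 arcsec apart (D = 4 arcsec) and one far away.
dets = [make_det(10.0, 0.0), make_det(10.0 + 3.0 / 3600.0, 0.0),
        make_det(10.0 + 6.0 / 3600.0, 0.0), make_det(20.0, 0.0)]
unique = assign_unique_numbers(dets)
print(unique)
```

In this example the first and third detections are 6 arcsec apart, which is larger than D, but both are linked through the middle detection and so receive the same number. The pairwise loop scales with the square of the number of detections, so a spatial index would be needed for a catalogue-sized list.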

In the process, 27 genuine neighbours have been assigned the same unique source number (that is, detections within the same observation and exposure). They are listed in Table 4.1. In each case, the detection has also been identified on at least one other observation (not listed).

In six of these cases the source was split in two by a CCD gap on one of the exposures, with the second detection being flagged as false. In 14 cases the source has been flagged as false in an environment that is known to cause spurious sources lying close together. In six cases no flag had been set, but the distance of the pair in the given observation is larger than the search radius of each detection. The reason why these have been assigned the same unique source number is that there is another detection on a different observation of the same region that lies between these two and connects them into a cluster, as explained above. In one case both objects have been flagged as possibly extended.

4.4.2 Additional time information

The columns MJD-OBS and MJD-END give the beginning and end of the observation (from columns DATE-OBS and DATE-END), in Modified Julian Date format. The column DURATION gives the length of the observation in seconds, computed from DATE-OBS and DATE-END.
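
As a sketch of this conversion, the Python fragment below derives a Modified Julian Date and a duration from two date strings. It assumes that DATE-OBS and DATE-END are given in the FITS form 'YYYY-MM-DDThh:mm:ss' and it ignores leap seconds and time-scale differences; the function names are illustrative.

```python
from datetime import datetime

MJD_EPOCH = datetime(1858, 11, 17)  # MJD 0.0 corresponds to 1858 November 17, 0h
DATE_FORMAT = "%Y-%m-%dT%H:%M:%S"   # assumed format of DATE-OBS and DATE-END


def to_mjd(date_string):
    """Convert a date string to a Modified Julian Date (leap seconds ignored)."""
    moment = datetime.strptime(date_string, DATE_FORMAT)
    return (moment - MJD_EPOCH).total_seconds() / 86400.0


def duration_seconds(date_obs, date_end):
    """Length of the observation in seconds."""
    start = datetime.strptime(date_obs, DATE_FORMAT)
    end = datetime.strptime(date_end, DATE_FORMAT)
    return (end - start).total_seconds()


mjd_obs = to_mjd("2000-01-01T12:00:00")
duration = duration_seconds("2001-06-01T10:00:00", "2001-06-01T20:30:00")
print(mjd_obs, duration)
```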

4.4.3 The Target Flag

The column TARGET_FLAG indicates for each detection whether it is likely to be the target of the XMM-Newton observation. This flag value may be of use in constructing unbiased source samples.

To assign the flag, the SIMBAD Astronomical Database was used to convert the name of the XMM-Newton observation target into celestial coordinates. This position was then compared against the position of catalogue detections in that observation. For those detections where the positions match within a nominal 10-arcsecond radius the flag was set to True (note that the positional accuracy is not always available for the target position, thus preventing a more quantitative approach).

Note that significant numbers of observations are of an extended or diffuse object, or the target may be multiple objects. In these cases it may not be possible to assign the TARGET_FLAG. In those cases where the TARGET_FLAG = True, users are reminded that this does not guarantee the XMM-Newton detection genuinely corresponds to that celestial object.

4.4.4 The Summary Flag

The column SUMM_FLAG provides an overall indication, as a single integer value, of the outcome of the visual screening process performed for each EPIC camera (PN, M1, M2). It is derived directly from the screening-flag values for the individual cameras (see Sec. 3.3.1), and its values denote 4 = good, 2 - 3 = possible problems, and 0 - 1 = probably bad.

In more detail, the summary flag is defined as follows, with the individual camera flags being set to True for:

1 = false detection
3 = the source lies within an extended region
5 = the source position is suspect
6 = the source is near an edge

Summary flag 4 is given if flags [1,3,5,6] for the three cameras [PN,M1,M2] are all False, i.e., there are no negative indications for this detection.

Summary flag 3 is given if flags [3 or 5] for any of the cameras [PN,M1,M2] are True, and flags [1 and 6] for the three cameras [PN,M1,M2] are all False, i.e., the detection is considered to have some possible problems.

Summary flag 2 is given if flags [6] for any of the cameras [PN,M1,M2] are True, and flags [1] for the three cameras [PN,M1,M2] are all False, i.e., the detection is close to a CCD edge/gap in at least one camera.

Summary flag 1 is given if flags [1] for the three cameras [PN,M1,M2] are all True, i.e., it is considered that the detection is likely to be spurious.

Summary flag 0 is given if all the cameras [PN,M1,M2] which detected the source have flag [1] True.

4.4.5 Off-axis Angle

A column has been added to give the off-axis angle of the source in each camera (xx_OFFAXANG). These values, given in arcminutes, have been computed from the detector coordinates of the source using the assumption that the optical-axis intersects with the focal plane at DETX = 0, DETY = 0 in each camera. Recent studies show that the actual position of the optical axis is displaced by ~1 arcmin for the MOS-2 and PN cameras (cf. Sec. 6.5).
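
A minimal sketch of this calculation is given below. It assumes a detector-coordinate scale of 0.05 arcsec per unit, which follows from the equivalence of 150 pixels and 7.5 arcsec quoted in Sec. 6.1, and it places the optical axis at DETX = 0, DETY = 0 as stated above. The function name is illustrative.

```python
import math

ARCSEC_PER_DET_UNIT = 0.05  # inferred from '150 pixels or 7.5 arcsec' in Sec. 6.1


def off_axis_angle_arcmin(detx, dety):
    """Off-axis angle in arcminutes, assuming the optical axis is at (0, 0)."""
    return math.hypot(detx, dety) * ARCSEC_PER_DET_UNIT / 60.0


angle = off_axis_angle_arcmin(3600.0, 4800.0)
print(angle)
```

A displacement of the true optical axis of ~1 arcmin, as mentioned above, would change the returned values by up to that amount.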

4.4.6 Additional catalogue cross-correlation columns

For each cross-correlation entry we give the astronomical catalogue name (CAT_NAME_N, with N denoting the number of the cross-correlation match), the name of the catalogued source (CAT_ENTRY_N), its position (CAT_RA_N, CAT_DEC_N, CAT_RADEC_ERR_N) and distance from the XMM source position (D_EPIC_CAT_N), and one measurement (CAT_VAL_N) along with the type of the measurement as stated in the respective catalogue (CAT_MEAS_N). While the astronomical cross-correlation products provide the catalogue name and a measurement (see Sec. 3.2.6), the other columns are additionally supplied for this catalogue.

The column XCORRMATCHES gives the total number of cross-correlation matches with astronomical catalogues. The first nine of these (sorted by increasing distance to the XMM source position) are listed here in the XMM catalogue. For ease of use, the nearest catalogue-specific cross-correlation match for each of the six astronomical databases listed below is given as well. Consequently, some of the cross-correlation entries are duplicated.

1. ROSAT with four catalogues being used (prefix ROS in the column names):

1RXS : ROSAT PSPC All-Sky Survey Bright Source Catalogue (BSC)
1RXS : ROSAT PSPC All-Sky Survey Faint Source Catalogue (FSC)
2RXP : ROSAT PSPC pointed
1RXH : ROSAT HRI pointed

2. ASD with two All-Sky Databases being searched (prefix ASD in the column names):

SIMBAD
NED

and one cross-correlation entry for each of the following All-Sky catalogues:

3. USNO: USNO-A2.0 (prefix USNO)
4. GSC: GSC 2.2.1 (prefix GSC)
5. 2MASS: 2MASS 2nd incremental release (prefix TMASS)
6. APM: APM Northern Catalogue (prefix APM)

5. Simulations and verifications

The number of false detections per field bears a simple relationship to the number of 'beams' in the field and the detection likelihood L:

N_false = N_beams * exp(-L).
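
To give a feeling for this relation, the short Python sketch below evaluates it for a few likelihood values and inverts it to find the likelihood at which a given number of false detections is expected. The number of beams used here is purely hypothetical; the true number for an EPIC field is not known (see Sec. 5.6 b)).

```python
import math


def expected_false(n_beams, likelihood):
    """N_false = N_beams * exp(-L)."""
    return n_beams * math.exp(-likelihood)


def likelihood_for(n_beams, n_false_target):
    """Likelihood L at which n_false_target false detections are expected."""
    return math.log(n_beams / n_false_target)


N_BEAMS = 5000.0  # hypothetical value, for illustration only
for likelihood in (6.0, 8.0, 10.0):
    print(likelihood, expected_false(N_BEAMS, likelihood))
print(likelihood_for(N_BEAMS, 1.0))
```

Because of the exponential dependence, each increase of the likelihood by one unit lowers the expected number of false detections by a factor of e.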

The source detection algorithm includes a calculation of likelihood, but in order to validate and calibrate this quantity, a program of simulations was undertaken.

Software was written to create simulated XMM EPIC images and to perform source detection on these fields.

5.1 The starting source list

The starting point for the simulation is a list of fake sources with random positions and fluxes. The source positions were evenly distributed over a section of celestial sphere within a mask made from the non-zero regions of the exposure map to be used, plus an extra 100-arcsecond pad all around the edge.

The source fluxes were designed to follow a two-slope logN-logS distribution with a sharp faint-end cutoff (see e.g. Fig. 5.1). The faint-end slope of -0.8 ran from the faint-end cutoff at 3.0*10^-18 to the break at 7.0*10^-15 erg/cm^2/s; the bright-end slope was -1.5; the whole being normalized to 200 sources deg^-2 at the break, giving nearly 100 000 sources deg^-2 in total. These flux values were defined in the energy band 0.5 to 2 keV. Flux values for other energy bands were calculated by assuming that each source had the same power-law spectrum with a photon index of -1.4. This spectral shape, together with the details of the flux distribution, was chosen so as to reproduce as closely as possible real high-latitude X-ray logN-logS distributions such as those reported in Hasinger et al (2001) or Mushotzky et al (2000).
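
The flux distribution can be sketched in Python as follows. The sketch treats the quoted slopes as slopes of the integral distribution N(>S); this is an assumption, although it does reproduce the quoted total of nearly 100 000 sources deg^-2. Fluxes are drawn by inverting N(>S), which is one standard way of sampling such a distribution and not necessarily the method used in the simulation software.

```python
import random

S_MIN = 3.0e-18      # faint-end cutoff [erg/cm^2/s], 0.5 - 2 keV
S_BREAK = 7.0e-15    # break flux [erg/cm^2/s]
N_BREAK = 200.0      # sources per deg^2 brighter than the break
FAINT_SLOPE = -0.8   # assumed to be the slope of the integral distribution
BRIGHT_SLOPE = -1.5


def n_brighter(flux):
    """Integral source density N(>S) per deg^2 of the two-slope model."""
    flux = max(flux, S_MIN)  # no sources fainter than the cutoff
    slope = FAINT_SLOPE if flux < S_BREAK else BRIGHT_SLOPE
    return N_BREAK * (flux / S_BREAK) ** slope


def draw_flux(rng):
    """Draw one flux by inverting N(>S) at a uniformly distributed density."""
    n = (1.0 - rng.random()) * n_brighter(S_MIN)  # uniform in (0, N_total]
    slope = BRIGHT_SLOPE if n <= N_BREAK else FAINT_SLOPE
    return S_BREAK * (n / N_BREAK) ** (1.0 / slope)


total_density = n_brighter(S_MIN)
rng = random.Random(1)
fluxes = [draw_flux(rng) for _ in range(1000)]
print(round(total_density), min(fluxes), max(fluxes))
```

With these parameters only about 0.2% of the simulated sources lie above the break, so the great majority contribute to the unresolved background rather than to the detected source list.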

5.2 Images

Simulated images were constructed from the source lists in two steps. A total band flux image was first constructed from the entire list, using the appropriate point spread function (PSF) at each position. The ideal procedure would be to construct a separate image for each energy band of interest, using an energy-integrated PSF, because the PSF varies with energy. However, since evaluating the PSF is computationally intensive, a number of short cuts were taken. Firstly, a single representative energy (1 keV) was used when obtaining the PSF. Secondly, for sources too faint to stand much chance of being detected (flux density less than 9.2*10^-19 erg/cm^2/s/eV at 1 keV, equivalent to about 4*10^-4 counts/s within 0.2 - 12 keV), a simple Gaussian PSF (of 5.0 arcsec sigma) was used rather than that obtainable from the XMM calibration information. The use of a single spectrum for all sources also made it possible to create a single starting image for all energy bands.

Where the XMM PSF was used, this was truncated in a square window scaled to include 95% of the total PSF flux. The Gaussian PSF was truncated in a square window of edge length equal to 14*sigma (= 70 arcsec). Both kinds of PSF image were then vignetted by a function of the form 1 - x^2 (in order to avoid discontinuities at the edge) and renormalized.

The second step comprised the following: for each energy band to be used in source detection, (i) the total band flux image was multiplied by the appropriate conversion factor to obtain an image in counts/s/pixel; (ii) the result was multiplied by an exposure map; (iii) appropriate background was added; finally, (iv) for each image pixel, the resulting real-valued counts were converted into random integer values with a Poisson distribution about the starting real value.

Flux-to-rate conversion factors were calculated 'on the fly' by numerical integration of the assumed source spectrum multiplied by the effective area, a filter transmission and a representative quantum efficiency curve.

Exposure maps from a real observation were used for added verisimilitude.

A simple and robust algorithm was used to obtain random numbers with a Poisson distribution. The resulting images were not found to exhibit any significant departure from Poisson statistics.
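
The second step can be sketched in Python as shown below. The Poisson generator used here is Knuth's multiplication method, chosen for this illustration because it is short and adequate for the small per-pixel means involved; the text does not say which algorithm was actually used. The conversion factor and the image values are made up, and the background is passed in as expected counts per pixel.

```python
import math
import random


def poisson_draw(mean, rng):
    """Knuth's multiplication method; suitable for small to moderate means only."""
    limit = math.exp(-mean)
    count = 0
    product = rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count


def simulate_band_image(flux_image, conversion, exposure_map, background, rng):
    """Steps (i) to (iv): flux -> counts/s -> expected counts -> Poisson integers."""
    image = []
    for flux_row, exp_row, bkg_row in zip(flux_image, exposure_map, background):
        row = []
        for flux, exposure, bkg in zip(flux_row, exp_row, bkg_row):
            expected = flux * conversion * exposure + bkg  # real-valued counts
            row.append(poisson_draw(expected, rng))
        image.append(row)
    return image


rng = random.Random(42)
flux_image = [[0.0, 2.0e-14], [0.0, 0.0]]   # erg/cm^2/s per pixel (made up)
exposure_map = [[47000.0, 47000.0], [47000.0, 40000.0]]  # seconds
background = [[0.38, 0.38], [0.38, 0.32]]   # expected counts per pixel
CONVERSION = 1.0e11  # hypothetical flux-to-rate factor [counts/s per erg/cm^2/s]
counts = simulate_band_image(flux_image, CONVERSION, exposure_map, background, rng)
print(counts)
```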

The added background was composed of a vignetted and a non-vignetted part of 1.0*10^-7 and 4.0*10^-7 counts/s/arcsec^2 respectively, which gave a total of about 0.38 counts/pixel in the centre of the field of view at the exposure used of 47 ks. This background count rate was chosen to be close to the median background count rate seen in the catalogue. It was thought to be less important to choose an exposure value similar to the majority in the catalogue (the median value seen in the catalogue is 25 ks for MOS-1) and hence a higher value was chosen in order to increase the number of sources found per field.

5.3 Source detection

Ideally the catalogue pipeline would have been used to perform source detection but for technical reasons this was found to be impractical. A script was therefore written to mimic the source-detection part of the catalogue pipeline. Images were constructed for each of the five standard energy bands and source detection in these bands proceeded in parallel.

5.4 Extraction of results

In order to interpret the results of the simulation, it is necessary to separate real from false detections. The original source parameters for each reliable detection must also be retrieved for purposes of comparison. These needs were fulfilled in two steps. In the first step, each detection was matched with that source from the starting list (called from here on 'the sim source') deemed most likely to have caused the detection. In the second step, the probability that the match occurred by chance was estimated.

The matching step was performed by finding the nearest sim source in an abstract 3-dimensional space constructed (separately for each detection) as follows. The first two dimensions comprise the image-plane coordinates, divided by the respective detection uncertainties. The third coordinate is a function of source flux, such that the sim sources are evenly distributed in this direction. This third coordinate ranges from 0 to 1/(uncertainty in flux detection). The resulting space has the property firstly that the sim sources are evenly distributed throughout a bounded area, and absent outside it; and secondly, that the (x,y,flux) error ellipsoid of the detection becomes transformed to a sphere.

The probability that the detection was false was estimated from the volume V of the sphere in the above space that had the detection at its centre and the distance to the nearest sim source as its radius. P_false is then

P_false = 1 - exp(- V * p)

where p is the density of sim sources in the space.

A control was performed to assess the reliability of this separation, by attempting to match the detected sources from one field with the list of sim sources used to generate another. As expected, the resulting values of P_false were evenly distributed between 0 and 1. Comparison of the results of normal runs with eyeball searches of the same fields also indicated that P_false for reliable sources was almost always less than a few percent. It was thus decided to take the 5% mark as the boundary between 'real' and 'false' detections. Numbers of false detections were thus estimated by counting all sources with P_false > 0.05 and multiplying by 1/0.95.
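
The two quantities used above can be sketched as follows. The factor 1/0.95 follows from the control result: if false detections have P_false values distributed evenly between 0 and 1, then 5% of them fall below the 0.05 boundary and are missed by the count. The function names are illustrative, and the distance and density must be expressed in the same normalized units.

```python
import math


def p_false(nearest_distance, density):
    """P_false = 1 - exp(-V * p), with V the volume of a sphere whose radius
    is the distance to the nearest sim source in the normalized 3-D space."""
    volume = 4.0 / 3.0 * math.pi * nearest_distance ** 3
    return 1.0 - math.exp(-volume * density)


def estimated_false_count(p_values, threshold=0.05):
    """Count detections above the threshold and correct for the missed fraction."""
    n_above = sum(1 for p in p_values if p > threshold)
    return n_above / (1.0 - threshold)


probabilities = [0.01, 0.5, 0.9, 0.04, 0.2]
print(p_false(0.5, 0.1), estimated_false_count(probabilities))
```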

5.5 Empty-field simulations

Because the number of false detections was higher than expected, a program of simulations with background but without sources was also performed. There are several reasons for doing this. Firstly, the image statistics are much simpler and therefore the results are easier to predict. Secondly, the results are easier to interpret, since every detection in an empty field is by definition a false detection. Thirdly, this approach is computationally quicker, so a number of runs with different parameters and treatment can easily be performed.

The images were made in a similar fashion to those already described, apart from the omission of sources. Source detection was, for simplicity, carried out in only one band.

5.6 Results

The simulation script was run several hundred times in order to arrive at statistically significant results. Note that all the results presented below refer to the MOS-1 camera exclusively. A comparative study between simulations of the three cameras has not yet been completed.

a) Systematic errors in count rate determination

It is known that the count values of faint sources tend to be overestimated. See Cruddace et al (1988) for a description of a similar effect detected in ROSAT data. In this connection it is worth emphasising that the source detection algorithm used to compile the present catalogue is a direct descendant of that used for ROSAT. Figure 5.2 shows the ratio between the detected count rate and the true count rate of the simulated source.

Note that the source-matching algorithm of the simulation software does not as yet correct for this effect. Hence true matches between detections and the original source list have probably been missed because the software was looking at too high a value of RATE. The performance of the source-detection software is thus probably a little better than shown herein.

b) Proportion of false detections

Figure 5.3 shows integrated source counts as a function of DET_ML, the likelihood figure that is calculated by the source-fitting software.

The dashed black line is proportional to the expected number of false detections per field in a single-band detection scheme. The number of beams has been chosen so that the theoretical line approximately overlays the red line, but is not otherwise significant. It is a not unreasonable figure for the number of beams (which governs the vertical positioning of the line on this plot) but the true number is difficult to calculate and is not known at this time.

The red curve shows the result from an empty-field simulation at a background level of 0.35 counts/pixel, consistent both with the full-field simulation runs and the catalogue values. (In fact the number of false detections seen in the blank-field simulations did not vary significantly with background, varying over the whole DET_ML range by a factor of only about 2 as the background level was varied by a factor of 30. Indeed much of this factor of 2 was attributable to imperfections in the background maps.)

The fall-off of the red curve from the black theory line at DET_ML below about 8 is a result of loss of sources due to non-optimal setting of an internal likelihood acceptance threshold in the source-detection chain. The count of real sources is deficient in the same proportion as the count of spurious sources. This error was unfortunately only discovered after the catalogue was created.

The blue curve represents the integrated total source counts in a full 5-band detection of images made from simulated source lists, as described above. The final curve, the green one, is the integrated number of sources identified as spurious in the same detection scheme. Thus the fraction of spurious sources above any given DET_ML value can be found by dividing the green curve by the blue one. Note for example that it is necessary to discard sources below DET_ML ~ 7 in order to ensure that fewer than 1% of the total source detections are spurious.

It is not presently known why the green curve is so different in both shape and height from the single-band curves. However some increase in false detection rate might be expected due to the more realistic 'lumpy' nature of the X-ray background in the full-field simulations, composed as it is mostly out of numerous unresolved sources. The fact that detection is carried out in parallel in 5 energy bands, and therefore that the total DET_ML value for each source must be greater than that in any single band, may also have a bearing. More investigations of this are planned.

c) Completeness

An integrated logN-logS curve for the 'reliable' detections (black curve) is compared to the logN-logS of the input model (red dashed line) in Fig. 5.1. The tally of detected sources appears to be still 90% complete at a (true, i.e., corrected) count rate of 1.0*10^-3 counts/s. Recall that this is at an exposure time of 47 ks: completeness limits for other values of exposure time should be scaled accordingly.

Figure 5.4 shows the same numbers in ratio form. The integrated number of detections was divided by the expected value to give this graph. Note that, because the ideal model rather than the actual, noisy distribution of simulated sources was used, it is possible for this ratio to slightly exceed 1, as it does around RATE = 0.002 counts/s.

References:

Cruddace R G, Hasinger G & Schmitt J H, Astronomy from Large Databases, eds Murtagh F & Heck A, p177 (1988)

Hasinger G et al, Astronomy and Astrophysics 365, L45-50 (2001)

Mushotzky R F, Cowie L L, Barger A J & Arnaud K A, Nature 404, 459-464 (2000)

6. Known problems and open issues

6.1 False detections due to 'warm pixels' in MOS-2

It is well known that the three EPIC cameras contain pixels which are permanently 'hot', i.e., they have an associated noise or dark current which makes them bright in all observations. These are excluded by on-board software or alternatively by the SAS task badpixfind which runs in the pipeline software. Each camera also contains so-called 'warm' pixels which are bright in some observations but not others. These have not been excluded from the processing and are evident by an excess of sources found at these points in the detector plane. The locations of these warm spots on the detectors are given, together with an estimate of the number of false detections which they contain, in Table 6.1 (the accuracy on the DETX/Y positions is +/-150 pixels or 7.5 arcsec).

Figure 6.1 shows the sources which have only been detected in the MOS-1 camera. The bright points at the top of the image indicate the area of the focal plane which is only seen by MOS-1. The other bright spots, apart from the central point, are warm pixels. It is worth noting that, apart from these few bright spots, the distribution of the sources is very uniform and there is no reason to suspect that they are not real. Indeed the spectral properties of these sources are consistent with sources which have been detected in two or more cameras and are hence presumably real (Fig. 6.2).

The source distribution in MOS-2 (Fig. 6.3) shows brighter and more frequent warm pixels than MOS-1. It also shows a strong trend for an excess in CCDs 1, 4, 6, and 7 (cf. Fig. 21 in the XMM-Newton Users' Handbook, Sec. 3.3.1.1.). The most probable explanation for this is a low-energy noise problem which is known to affect some MOS-2 CCDs. The spectral properties of these sources fall into two bands (Fig. 6.4); a swathe of about 2000 sources which follow the trend for real sources and then a further ~3000 which are very soft.

A first cut at excluding these excess MOS-2 only detections may be made by looking at the DET_ML values for bands 1 and 2. The single camera detections for MOS-1 and PN mainly have DET_ML for band 2 greater than that of band 1 (Figs 6.5.a) and c)). For MOS-2, about 50% of these detections have a higher band 1 likelihood (Fig. 6.5.b)). In numbers, 2601 MOS-2 only detections have DET_ML(bd1) > DET_ML(bd2) as opposed to 386 such detections for MOS-1.

The selection

.not.   (mos_2_only_detection   .and.  m2_1_det_ml > m2_2_det_ml)

would exclude ~2600 detections (the 2601 noted above), of which ~400 (comparable to the 386 such detections seen in MOS-1) may not be due to the 'warm-pixel' effect in MOS-2.
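The selection above can be sketched in Python as follows. This is a minimal illustration rather than the catalogue software: detections are assumed to be held as dictionaries, the key mos_2_only_detection is a hypothetical Boolean taken directly from the expression above, and m2_1_det_ml and m2_2_det_ml are assumed to hold the MOS-2 DET_ML values for bands 1 and 2. The records are made-up values.

```python
def passes_mos2_cut(det):
    """Return True if the detection is kept by the selection
    .not. (mos_2_only_detection .and. m2_1_det_ml > m2_2_det_ml)."""
    if not det["mos_2_only_detection"]:
        # Detections also seen in another camera are never excluded by this cut.
        return True
    # MOS-2 only: exclude when the band 1 likelihood exceeds the band 2 likelihood.
    return not (det["m2_1_det_ml"] > det["m2_2_det_ml"])


# Toy records with invented values, for illustration only.
detections = [
    {"id": "a", "mos_2_only_detection": True, "m2_1_det_ml": 15.0, "m2_2_det_ml": 3.0},
    {"id": "b", "mos_2_only_detection": True, "m2_1_det_ml": 4.0, "m2_2_det_ml": 12.0},
    {"id": "c", "mos_2_only_detection": False, "m2_1_det_ml": 20.0, "m2_2_det_ml": 5.0},
]

kept_ids = [d["id"] for d in detections if passes_mos2_cut(d)]
print(kept_ids)
```

Checking the MOS-2 only condition first means that the DET_ML comparison is never evaluated for detections in which the MOS-2 values might be undefined.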

A few bright spots are visible in the EPIC-PN source map (Fig. 6.6) together with a number of enhancements. The central square feature reflects the fact that the MOS cameras are sometimes used in Small Window mode where the outer portion of the central chip is not exposed, leading to an excess of PN sources. The enhancement at the left hand edge of the field of view principally reflects sky area which is only observed by the PN. As a check the spectral properties of sources observed uniquely with the PN are similar to sources observed in two or more cameras (Fig. 6.7).

6.2 False detections due to 'hot pixels' in the PN

The PN detector shows two hot pixels that were often detected by the source detection algorithm. These are:

Hot Pixel #1 at (RAWX : RAWY) = (55 : 75) on CCD 1 and

Hot Pixel #2 at (RAWX : RAWY) = (39 :  8) on CCD 10.

Hot Pixel #1 has been detected 290 times, of which 28 detections were not flagged as false (i.e., PN_VER_FALSE was not set to T). Hot Pixel #2 was detected less frequently (59 times) but was missed relatively more often during screening (36 detections were not flagged as false), since it is located near the edge of the CCD, which is often slightly noisier. Thus, only 0.24% of the PN detections flagged as "good" (64 of 26130) are still due to one of these two hot pixels.
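The residual contamination quoted above follows directly from the numbers in the text. The short Python check below is a sketch using only those numbers; the variable names are illustrative.

```python
# Numbers quoted in the text above.
detected = {"Hot Pixel #1": 290, "Hot Pixel #2": 59}
not_flagged = {"Hot Pixel #1": 28, "Hot Pixel #2": 36}
good_pn_detections = 26130

# Hot-pixel detections that survived screening, as a share of good PN detections.
residual = sum(not_flagged.values())
residual_percent = 100.0 * residual / good_pn_detections

# Fraction of each hot pixel's detections that screening missed.
missed_percent = {name: 100.0 * not_flagged[name] / detected[name] for name in detected}

print(residual, round(residual_percent, 2))
for name, pct in missed_percent.items():
    print(f"{name}: {pct:.0f}% of detections missed by screening")
```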

6.3 Effect of background characterization on fluxes of faint sources

As part of the process of source detection, the images with the sources excluded are modeled by splines using the SAS task esplinemap. Generally this process reduces the uncertainty in the background estimate (compared to an annular box estimate), leading to a better background-subtracted source strength estimate. For very low source fluxes the quality of the background estimate becomes very important: a poor background model can lead to systematic errors in the source count, count rate and flux estimates. The few observations in which the background model fit was highly suspect have been removed from the catalogue (these contained extremely bright extended and point sources). Users are warned that the faintest sources may have under-reported count-related uncertainties in some fields.

6.4 Note on statistical position errors

Since the initial compilation of the catalogue it has been discovered that the statistical positional error (RADEC_ERR) is underestimated for a significant number of faint detections. This is the result of a previously undiscovered coding error in the detection parameterization routine. The majority of the detections affected by this error have < 50 counts in a single EPIC camera (i.e., PN, MOS-1 or MOS-2) and the effect appears to be strongly linked to detections which have negative detected counts in one or more of the 5 energy bands (Tab. 3.1). Positional errors quoted in the catalogue for such detections are often < 0.5 arcsec, whereas the expectation from the width of the XMM-Newton PSF and the signal-to-noise of the detections is that the correct values should typically lie in the range 1 - 2 arcsec. Note that it is not possible to correct this error explicitly in an accurate way without substantial reprocessing of the catalogue data, hence it has been decided to issue the catalogue with this defect uncorrected at this stage.

This error has the following potential consequences:

(i) a small bias may be present in the celestial positions of detections where one of the camera positions has an erroneously low error, leading to this camera position having too high a weight in the merged position;
(ii) it may affect the matching of detections in different EPIC cameras, leading occasionally to matches being missed (and hence to detections being listed as independent whereas in reality they relate to the same object);
(iii) searches for matches with external catalogue sources will extend to too small a search radius as this radius is, in part, set by the statistical positional error assigned to the detection. This effect is mitigated by the fact that the search radius involves a systematic positional error for each XMM-Newton field in addition to the purely statistical error component.

Our analysis suggests that the size of these effects [(i) - (iii)] is relatively small and has limited impact on the quality of the catalogue, but users of the catalogue need to treat the RADEC_ERR values for faint sources with caution. For any detection with fewer than 50 counts in one or more cameras, the appropriate RADEC_ERR value to assume is  ~1 - 2 arcsec in those cases where the quoted value is < 1 arcsec.
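The recommendation above can be applied as a simple rule when reading the catalogue. The Python sketch below is an illustration only: the function name and arguments are hypothetical, undefined camera counts are assumed to be passed as None, and the replacement value of 1.5 arcsec is an arbitrary choice from within the 1 - 2 arcsec range quoted above, not a value given by the catalogue.

```python
def assumed_radec_err(radec_err, camera_counts, faint_limit=50.0, floor=1.0, assumed=1.5):
    """Return the positional error (arcsec) to assume for a detection.

    radec_err     -- quoted RADEC_ERR in arcsec
    camera_counts -- counts per EPIC camera (None where a camera has no value)
    assumed       -- replacement value; 1.5 is an arbitrary pick within 1 - 2 arcsec
    """
    faint = any(c is not None and c < faint_limit for c in camera_counts)
    if faint and radec_err < floor:
        return assumed
    return radec_err


# A faint detection with an implausibly small quoted error is replaced.
print(assumed_radec_err(0.3, [30.0, 120.0, None]))
# A detection with at least 50 counts in every camera is left unchanged.
print(assumed_radec_err(0.3, [80.0, 120.0, 95.0]))
```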

6.5 Calibration issues: Vignetting

In-orbit studies to map the response of the EPIC cameras imply that the centre of the vignetting function is not coincident with the optical-axis position measured on the ground (Lumb et al. 2002). The count rates and fluxes given in the catalogue have been calculated using an optical-axis position of DETX = 0, DETY = 0 for all three EPIC cameras. This is approximately true for MOS-1, but recent work, in particular on the Coma cluster (Finoguenov et al. 2003), gives positions of DETX = 400, DETY = -1350 for the centre of the MOS-2 vignetting function and DETX = 1250, DETY = 300 for that of the PN.

This leads to a spatial shift of the vignetting function within the exposure maps generated for these cameras such that sources in the direction of the shift show an excess and those lying in the opposite direction an apparent deficit of flux. The magnitude of the flux difference is both spatially and energy dependent but can reach 15% for the high energy bands of highly off-axis sources. This can be clearly seen by comparing camera fluxes as a function of detector position (Figs 6.8 and 6.9).
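To give a feel for the size of these shifts, the Python sketch below converts the MOS-2 and PN vignetting centres quoted above into angular offsets from DETX = 0, DETY = 0. It assumes a scale of 0.05 arcsec per DETX/DETY pixel, which is inferred from the statement in Sec. 6.1 that 150 pixels correspond to 7.5 arcsec; the names used are illustrative.

```python
import math

# Assumed pixel scale, inferred from "150 pixels or 7.5 arcsec" in Sec. 6.1.
PIXEL_ARCSEC = 0.05

# Vignetting-function centres (DETX, DETY) quoted in the text above.
centres = {"MOS-2": (400, -1350), "PN": (1250, 300)}


def offset_arcsec(detx, dety):
    """Angular distance of a detector position from DETX = 0, DETY = 0."""
    return math.hypot(detx, dety) * PIXEL_ARCSEC


for camera, (detx, dety) in centres.items():
    print(f"{camera}: {offset_arcsec(detx, dety):.1f} arcsec")
```

Under this assumed scale both offsets come out at roughly one arcminute.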

References:

A. Finoguenov, U. Briel, B. Aschenbach, 2003, 'Calibrating the Vignetting of XMM with Coma Cluster Centre observations', XMM calibration report.

D.H. Lumb, A. Finoguenov, R. Saxton, B. Aschenbach, P. Gondoin, M. Kirsch, I. Stewart, 2002, 'In-orbit calibrations of XMM-Newton Telescopes', SPIE 4851, 2003, in press.

6.6 Sensitivity maps

At present, sensitivity maps (indicating the expected minimum detectable source count rate at each point in the field of view) are provided as part of the standard set of XMM-SSC data products and as catalogue products. A single map in the band 0.5 - 4.5 keV (the XID band) is produced for each exposure of each observation. Our investigations have raised some concerns as to whether the algorithm employed to generate these maps is optimal, and users are advised, for the present, to treat these maps with some caution.

In the near future it is planned to implement a new algorithm and also to produce maps in the bands 6, 7, and 8 (0.2 - 2.0, 2.0 - 12.0, and 0.2 - 12.0 keV, respectively).  These new maps will be included in a future incremental release of the catalogue products.

6.7 ROSAT cross-correlation identifications

Since the compilation of the catalogue it has been noted that a small number of matches with ROSAT catalogue sources are missing. This arises for two reasons:

(i) the positional errors on 1RXS catalogue sources were erroneously assumed to be 90% confidence values whereas in reality they are 68% confidence values. As a result, the search radius used for cross-correlation was too small.
(ii) the positional errors in the publicly available 1RXH catalogue may be underestimated (private communication from the ROSAT team at MPE, Max-Planck-Institute for Extraterrestrial Physics). This again means that the search radius used for cross-correlation was too small.
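The size of the effect in item (i) can be estimated with a short calculation. The Python sketch below assumes that the positional error follows a circular two-dimensional Gaussian, for which the radius enclosing a probability p is sigma * sqrt(-2 ln(1 - p)). This error model is an assumption made here for illustration; it is not stated in this guide, and the function name is hypothetical.

```python
import math


def containment_radius_factor(p_from, p_to):
    """Ratio r(p_to) / r(p_from) for a circular 2-D Gaussian error distribution,
    where the radius enclosing probability p is sigma * sqrt(-2 * ln(1 - p))."""
    return math.sqrt(math.log(1.0 - p_to) / math.log(1.0 - p_from))


# Factor by which a 68% radius must be scaled to reach 90% confidence
# (valid only under the 2-D Gaussian assumption stated above).
factor = containment_radius_factor(0.68, 0.90)
print(f"{factor:.2f}")
```

Under this assumption, treating a 68% radius as if it were a 90% radius makes the 1RXS contribution to the search radius too small by about 40%.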

6.8 Undefined Boolean values

In the catalogue, Boolean values which are undefined have actually been set to False. However, this does not affect quality selections since all active settings of dubious quality were changed to the value True (e.g., the summary flag, Sec. 4.4.4).

6.9 Issues relating to specific catalogue columns

6.9.1 Detector coordinate values undefined

The xx_DETX/xx_DETY coordinates are undefined for two detections for MOS-1, for four detections for MOS-2, and for 817 detections for EPIC-PN. This is due to a combination of attitude problems with individual observations and an incorrect treatment of the edge of the field of view (FOV). It predominantly affects sources situated at the edge of the FOV. A related effect is that the off-axis angle xx_OFFAXANG is also undefined for these detections.

6.9.2 Chip coordinate values undefined

The xx_RAWX/xx_RAWY coordinates are undefined for about 100 detections in each camera. This generally occurs when the centre of the source falls within a chip gap or at the edge of a CCD.

6.9.3 Zero counts

Some sources exist with zero counts and zero error in a camera. This is due to a software problem. All these detections have flag 1 (for false detection, xx_VER_FALSE) set to True and the summary flag (SUMM_FLAG) is zero.

6.9.4 Duplicated columns

The xx_CUTRAD columns are duplicated: the extraction radius for the detection's subimage is given for each band but due to the nature of the detection technique (see Sec. 3.2.3 f)) it is actually the same for each band.

6.9.5 Unique source numbers

In 27 cases, the same unique source number (UNIQUE_SRCNUM) has been assigned to more than one detection in the same observation. They are listed in Table 4.1. See Sec. 4.4.1 for details.

6.9.6 Field Identification

The column FIELD_CAT, which is a placeholder to identify the type of object of the field target, is not currently set.

7. Discussion of the catalogue

7.1 Numbers of detections and discrete sources

The catalogue contains source detections drawn from 585 XMM-Newton EPIC observations made between 2000 March 1 and 2002 May 5, selected according to a number of criteria described in Sec. 3. The sky distribution of these observations is shown in Fig. 7.1 a) and Fig. 7.1 b) in Galactic and equatorial coordinates, respectively. Net exposure times in these observations range from  < 1000 up to  ~ 100000 seconds. The total area of the catalogue fields is  ~ 90 deg2, but taking account of the substantial overlaps between observations, the net sky area covered independently is  ~ 50 deg2 (see Sec. 7.3 for details).

The catalogue contains 33026 X-ray source detections with likelihood values > 8 and summary quality flag > 0, together with a further 23685 detections with lower likelihood values and/or summary quality flag = 0. These latter sources have lower reliability, since those with quality flag = 0 are deemed to be false detections due to processing deficiencies, whilst at likelihoods below 8 the fraction of spurious sources expected on a statistical basis increases rapidly. The 33026 X-ray source detections relate to 28279 unique X-ray sources (Sec. 4.4.1).
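The reliability criterion and the numbers quoted above can be put together in a short Python sketch. It is illustrative only: the function and its argument names are hypothetical stand-ins for the detection likelihood and the summary quality flag, and the counts are simply those given in the text.

```python
def is_reliable(likelihood, summ_flag):
    """Selection quoted above: likelihood > 8 and summary quality flag > 0."""
    return likelihood > 8 and summ_flag > 0


# Counts quoted in the text above.
reliable_detections = 33026
lower_reliability_detections = 23685
unique_sources = 28279

total_detections = reliable_detections + lower_reliability_detections
detections_per_source = reliable_detections / unique_sources

print(total_detections)
print(f"{detections_per_source:.2f} reliable detections per unique source")
```

The ratio above unity reflects sources that were detected in more than one observation because of overlapping fields.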

The number of detections and discrete sources as a function of detection likelihood is summarized in Fig. 7.2.

7.2 Expected number of spurious detections

The expected number of spurious detections as a function of likelihood has been estimated by simulation (Sec. 5) to be ~ 10, 5, 2 and 1 per EPIC exposure (i.e., for a single camera) at likelihood thresholds of 6, 8, 10 and 12, respectively.
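These rates can be scaled to a set of exposures with a simple lookup, as in the Python sketch below. The function name and the example of three exposures (one per EPIC camera) are illustrative, and the rates are the approximate values quoted above, so the result should be read as a rough estimate only.

```python
# Approximate spurious detections per EPIC exposure at each likelihood threshold,
# as quoted in the text above.
SPURIOUS_PER_EXPOSURE = {6: 10, 8: 5, 10: 2, 12: 1}


def expected_spurious(n_exposures, threshold):
    """Rough expected number of spurious detections in n_exposures single-camera exposures."""
    return n_exposures * SPURIOUS_PER_EXPOSURE[threshold]


# Example: one observation with an exposure from each of the three EPIC cameras.
print(expected_spurious(3, 8))
```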

7.3 Sky coverage / area

The total sky area of the 585 XMM-Newton observations is shown in Fig. 7.3 as a function of net exposure time. This figure shows both the 'nominal' sky area and 'actual' sky area, the latter correcting for field overlaps and the off-axis reduction in effective area. The somewhat smaller area for the PN camera is primarily due to the fact that PN exposures are included only if the camera is in full-frame or large-window mode (Sec. 2.2, item 3), although the net EPIC PN exposures are also somewhat lower overall.

The total independent ('actual') sky area covered by the catalogue observations, i.e., after correcting for overlap, is  ~ 52 deg2 for EPIC MOS-1 and EPIC MOS-2 and  ~ 44.5 deg2 for EPIC PN.

7.4 Example statistical distributions

7.4.1 Astrometry

Inter-comparison of the positions from each EPIC camera is shown in Fig. 7.4 which demonstrates the good agreement achieved with any residual systematic effects constrained to be <= 0.5 arcsec.

The overall astrometric accuracy of the catalogue is illustrated in Fig. 7.5, which shows the distribution of separations between XMM-Newton detections (RA_CORR/DEC_CORR) and optical objects in the USNO A2, GSC 2 and APM catalogues. These distributions are consistent with expectations for the quoted errors on the XMM catalogue positions, once an approximate correction has been applied to the statistical errors of faint XMM detections to account for the position error issue described in Sec. 6.4.

7.4.2 X-ray fluxes

A large fraction of the catalogue sky area is covered by 2 or 3 of the EPIC cameras, thus permitting a detailed comparison of the fluxes measured by each camera. The results of this exercise are shown in Figs 7.6 & 7.7 which show the inter-comparison of the MOS-1 and MOS-2 fluxes and the average MOS flux with the PN flux. The results show an overall good agreement between the camera fluxes.

A detailed comparison of the measured fluxes has been undertaken (see XMM-Science Ops Centre Technical Note CAL-TN-0023-v2.0.ps; Saxton, 2003). This confirms the good agreement between the camera fluxes, and identifies and explains systematic offsets introduced by low source counts (Sec. 5.6 a)), background subtraction issues (Sec. 6.3), large off-axis angles (Sec. 6.5) and extreme source spectra.

7.4.3 X-ray colours

The distribution of X-ray colours for catalogue detections is shown in Fig. 7.8 for PN camera detections (MOS camera results are very similar).

7.4.4 Archival catalogue cross-correlations

For the catalogue detections with likelihood values > 8 and summary quality flag > 0, a large fraction have matches with one or more archival catalogue objects (Sec. 3.2.6), but only  ~ 4% of the total have matches within 10 arcsec; these correspond to matches from external catalogues with relatively accurate positions.

Matches with specific catalogues include (see Sec. 4.4.6):
 

 - 8668 with USNO A2
 - 5923 with GSC 2
 - 4791 with APM
 - 2274 with 2MASS
 - 4643 with ROSAT catalogue objects (1RXS, 2RXP, 1RXH)
 - 4683 with objects in the SIMBAD or NED collections ('ASD' within 10 arcsec)

There is of course substantial overlap between these catalogues so these are not independent values and the chance match rate for the larger catalogues is non-negligible.

Document revision history

Release No.   Release Date   Comments
1.0           7 Apr 2003     First release
1.1           9 Apr 2003     Sections 3.2.6, 4.4.1 (1st para), 7.4.1: revised text to improve clarity
1.2           15 Apr 2003    Section 3.2.2 d): revised step 4
1.3           28 Jan 2004    Minor corrections in the column descriptions part 3 and 4, in the Summary, Introduction, Section 3.2.5 (ii) and 3.2.6; Added Section 6.2; Changed section numbering in Section 6

Appendices

A.1 Catalogue data-products description

The catalogue was produced using a modified version of the standard SSC pipeline (http://xmmssc-www.star.le.ac.uk and follow the link 'Pipeline Processing'). Thus almost all the products associated with the catalogue follow the standard specification, as described in the Data Files Handbook and the SSC products Specification available at http://xmm.vilspa.esa.es/external/xmm_user_support/documentation/index.shtml. There are no products for the OM or RGS associated with the catalogue, as the data from these instruments were not processed in making the catalogue.

In addition to the standard pipeline products, three other graphical product types were made during the catalogue pipeline processing. They were made from the other FITS products, and are listed here:

P*EMSRLI*.PNG  The camera ML detected sources plotted on the camera total band image.
P*EXPMAP*.PNG  The camera ML detected sources plotted on the camera exposure map.
Sources are labeled with their camera xx_ML_ID_SRC numbers (listed in catalogue column nn_ML_ID_SRC, where nn = M1, M2 or PN).
P*FBKTSR*.PNG  A plot of the rates file FITS product used to determine intervals of sufficiently low background to be included when accumulating the image used for source detection.

Finally, after the catalogue was made, one extra set of products was made: the thumbnail images (see Sec. 3.3.2). These graphical products were made from the FITS images. These files were not made using the SAS, and for this reason they have filenames of the form C*SRCIMG*.PNG (the other parts of the filename follow the pipeline product standard).

A.2 List of observations used to construct the catalogue

List of observations ('fields').

A.3 Calibration files

A list of calibration files used.

A.4 Catalogue pipeline release notes

Summary and detailed release notes for the pipeline used to make the catalogue source lists and other products.

A.5 XMM catalogue conference paper

The XMM-Newton serendipitous source catalogue. Watson et al., 2003, Astron. Nachr., 324, 89.