This notebook is available at https://github.com/SterlingYM/PIPS/tree/master/docs/data_manip.ipynb

Data preparation and manipulation

[1]:
import PIPS
import matplotlib.pyplot as plt
import time
import numpy as np
[3]:
PIPS.about()
--------------------------
-    Welcome to PIPS!    -
--------------------------
Version: 0.3.0-beta.1
Authors: Y. Murakami, A. Savel, J. Sunseri, A. Hoffman, Ivan Altunin, Nachiket Girish
--------------------------
Download the latest version from: https://pypi.org/project/astroPIPS
Report issues to: https://github.com/SterlingYM/astroPIPS
Read the documentations at: https://PIPS.readthedocs.io
--------------------------

The photdata object in PIPS provides tools to perform basic operation for data manipulation. The implemented operation in the current version includes the following:

  • Initialization

  • Basic info / note storage

  • Basic information check (print(), len(), etc.)

  • Data identity check (are two photdata objects identical?)

  • Data cut (remove data points with uncertainty above threshold)

  • Data cut reset

  • Data concatenation

Initialization

The main function of photdata is to hold the photometric data. The object requires a list or array or three elements, x, y, and yerr each containing the same length of the time, magnitude (or flux), and magnitude-uncertainty data.

For convenience, datafile generated by LOSSPhotPipeline can be directly imported using a helper function data_readin_LPP.

[3]:
data = PIPS.data_readin_LPP('../sample_data/005.dat',filter='V')
x,y,yerr = data
print('data shape:\t',np.array(data).shape)
print('x shape:\t',x.shape)
print('y shape:\t',y.shape)
print('y-error shape:\t',yerr.shape)
data shape:      (3, 103)
x shape:         (103,)
y shape:         (103,)
y-error shape:   (103,)

The most basic method to initialize data is to give data=[x,y,yerr] as an argument for PIPS.photdata(). The initialization function also accepts label and band arguments, each of which should be used to store information.

[4]:
star  = PIPS.photdata(data) # the most basic way to initialize object
star2 = PIPS.photdata(data,label='Star2',band='V') # same as star1, but with more info

copying photdata object

Printing information

photdata accepts some basic python operations, such as str(), print(), len(), and hash(). These can be used to check certain properties of the object.

[5]:
# basic information check
print('* Basic information check')
print(star)              # the basic information
print(star2)             # the basic information -- label and filter info is included
print('len():',len(star))         # number of datapoints
print('hash():',hash(star))
* Basic information check
Photdata : band=None, size=103, period=None
Photdata Star2: band=V, size=103, period=None
len(): 103
hash(): 3349517558981431842

Creating an identical copy

photdata object has methods __copy__() and copy(). These methods by default performs a deepcopy instead of shallow copy (binding) to the original object to avoid possible error.

[15]:
star3 = star2.copy()    # creates a copy of star3 (including all attributes)

Checking if two objects are identical

[16]:
print('when two objects have the same data, they are considered identical')
print('star == star2:',star == star2)     # two objects are identical (hash-equality)
print('star == star3:',star == star3)     # two objects are identical (hash-equality)

print('\nIf any of the data points are changed, they are not identical anymore')
star3.x[-1] = 0         # manually change the data (don't do this!)
print('star == star3:', star == star3)    # False because data itself is changed
when two objects have the same data, they are considered identical
star == star2: True
star == star3: True

If any of the data points are changed, they are not identical anymore
star == star3: False

Note that, since copy() performs deep copy, star2 is unchanged even after star3 has been changed.

[17]:
star2.x[-1] # star2 has not been changed
[17]:
58805.179271

Applying cuts

The cut() function in photdata object offers a quick and easy way to apply data cuts. It takes in xmin, xmax, ymin, ymax, yerr_min, yerr_max arguments, and any combination of these can be applied simultaneously. Once cuts are applied, the existing cuts are effective even when a user applies another cut on different parameter later.

[18]:
period,_ = star.get_period(multiprocessing=False)
[19]:
star2.reset_cuts()
print('Before cuts:', star2)           # info before cuts are applied
star2.plot_lc(period,figsize=(4,2)); plt.show()
star2.cut(yerr_max=0.02)               # cuts data (based on x, y, and yerr values)
print('After a cut:', star2)           # now data is shorter
star2.plot_lc(period,figsize=(4,2)); plt.show()
star2.reset_cuts()
print('After cut reset:', star2)       # reset the cut settings
star2.plot_lc(period,figsize=(4,2)); plt.show()

star2.cut(ymin = 15.8)                 # a cut
star2.cut(yerr_max = 0.01, xmin=58676) # additional cuts
print('After another cut:', star2)     # now both yerr_min and ymin are applied
star2.plot_lc(period,figsize=(4,2)); plt.show()
print('* Even after cuts are applided, two originally identical objects are still identical.')
print('star == star2:', star == star2) # raw data is internally preserved
Before cuts: Photdata Star2: band=V, size=103, period=None
_images/data_manip_19_1.svg
After a cut: Photdata Star2: band=V, size=89, period=None
_images/data_manip_19_3.svg
After cut reset: Photdata Star2: band=V, size=103, period=None
_images/data_manip_19_5.svg
After another cut: Photdata Star2: band=V, size=14, period=None
_images/data_manip_19_7.svg
* Even after cuts are applided, two originally identical objects are still identical.
star == star2: True

Combining / concatenating two objects

It is sometimes necessary to merge two sets of data reduced/prepared separately. PIPS offers a quick way to merge two datasets using + (add) operator.

[22]:
print(star)
print(star2)
star4 = star + star2     # data concatenation (returns another photdata object)
star4.label = 'Combined' # always add labels
print(star4)             # star4 has more datapoints (notice cuts are applied to star2)
print('star == star4:', star == star4)  # once the data itself is changed, they are not identical
Photdata : band=None, size=103, period=0.6968874975991536
Photdata Star2: band=V, size=14, period=None
Photdata Combined: band=None, size=117, period=None
star == star4: False