Evidence of the Great Attractor and Great Repeller from Artificial Neural Network Imputation of Sloan Digital Sky Survey
1. Introduction
1.1. The Sloan Digital Sky Survey
The Sloan Digital Sky Survey (SDSS) began its life in 2000 and the legacy period lasted around 8 years. At the time, the SDSS telescope carried the largest digital camera in the world and was capable of taking up to 640 spectra simultaneously [1]. This advancement allowed for the creation of a 3D colour map of the Universe, containing potentially hundreds of millions of objects. This map is made in five photometric bands: ultraviolet (u), green (g), red (r), near-infrared (i) and infrared (z). Spectroscopic redshifts give a dimension of depth to the otherwise 2-dimensional photometric sky survey [2]. The optical colours of galaxies can also be used to infer the age of the stars contained within them. Redder galaxies tend to have older stars, whereas blue galaxies have younger stars. In practice, we can quantify “colour” in astronomy as the difference in magnitude in two different bands, for example [3].
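As a minimal illustration of colour as a magnitude difference, the sketch below computes a colour index from two band magnitudes. The choice of the g and r bands and the magnitude values are assumptions made purely for illustration and are not taken from the SDSS data; the only fact relied upon is that a larger difference (bluer band minus redder band) corresponds to a redder object.

```python
def colour_index(mag_blue_band: float, mag_red_band: float) -> float:
    """Colour index: magnitude in the bluer band minus magnitude in the redder band."""
    return mag_blue_band - mag_red_band


# Hypothetical (g, r) magnitudes, chosen only for illustration.
older_population = colour_index(17.2, 16.3)    # larger g - r: redder galaxy
younger_population = colour_index(17.0, 16.7)  # smaller g - r: bluer galaxy

print(f"g - r (redder galaxy): {older_population:.2f}")
print(f"g - r (bluer galaxy):  {younger_population:.2f}")
```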
Figure 1. An example of an SDSS map of the Universe. This image contains millions of galaxies of varying redshifts as well as the all-important “gaps” in the image created by dust clouds in the galactic plane [4].
Despite the achievements of the SDSS 3D Map of the Universe, there exist two regions of the sky, from RA 57 - 114 degrees and approximately 266 - 301 degrees, which are absent from the data (see Figure 1). These two occultations are produced by the dust and material of our own galaxy, the Milky Way, as it cuts across the sky (SciTech, 2020). In another article on the SDSS blog, the author discovered the following: “The black parts of the pie are where SDSS did not map galaxies, either because our Milky Way is blocking the view from Earth, or because those parts of the Universe are not visible from our telescope in New Mexico.” [5]
This information about the location of the telescope is curious, both because it was not mentioned in the [6] video and because there are two telescopes operating on the SDSS mission: the Sloan Foundation telescope in New Mexico and the one at the Las Campanas Observatory in the Atacama Desert in Chile [6]. It is assumed that a portion of the galaxies invisible to the Sloan telescope would be visible from Atacama, and vice versa.
1.2. Advanced Data Analysis
Considering how many images and spectra the SDSS project has produced, it is no wonder that astronomers have had to develop and employ some unique and interesting methods to analyse, categorise and classify the data. In the legacy portion of the project, the SDSS team relied on the general public to classify “10s of thousands or even millions of galaxies” as either elliptical or spiral, through projects such as Galaxy Zoo [2]. Today, astronomers and astrophysicists have advanced data analysis tools, such as deep learning Neural Networks, to classify such objects. Deep learning tools have been used variously to estimate the spectroscopic redshift of quasars [7], to classify galaxies [8], to estimate the atmospheric parameters of DA White Dwarf Stars [9] and to retrieve the Internal Kinematics of Galaxies [10].
In a previous research paper [11], the author attempted to impute the missing galaxies in the SDSS map using Deep Image Prior (DIP) Inpainting [12] [13], polar transformations and Linear Regressor Artificial Neural Networks (ANN). The ANN imputation was specifically focused on the Northern Hemisphere, while the DIP imputed the entire visible universe. That research confirmed the existence of the Great Attractor and the Homogeneous nature of the Universe.
In this paper, the author attempted to impute the Southern Hemisphere with the Linear Regressor ANN and complete the map of the Universe. The results once again confirm the existence of the Great Attractor and the homogeneity of the Universe, and reveal a new development in the Southern Hemisphere, which the author describes as “The Great Repeller”. As the name suggests, this is the counterpart to the Great Attractor and appears to be a region of repulsive force containing relatively few galaxies.
2. Great Attractor and Repeller
The Great Attractor is speculated to be a mass of galaxies millions of times heavier than our own and located at or between the Centaurus and Virgo Clusters, behind the Milky Way’s spiral edge. This massive object so disturbs the homogeneity of space that it creates two massive voids near the Pavo Indus cluster in the Southern Hemisphere. The motion in the direction of this mass is known as “Dark Flow”. While the Great Attractor is conceivably a supermassive black hole, the Great Repeller could exist as the result of a high concentration of Dark Matter or Dark Energy in that region of space, as these two forms of energy and matter are believed to have repulsive properties. However, whether this is a real force or simply the cumulative result of a unipolar attractive force stemming from the Great Attractor is not immediately clear.
Gerd Pommerenke [14] has speculated on the existence of a Great Attractor and Repeller in the form of the Event Horizon and Particle Horizon of the Universe. In this model, matter enters the Universe at the Particle Horizon (Great Repeller) and exits at the Event Horizon (or Great Attractor). This model stipulates a toroidal topology for the Universe, which is essentially a continuous Big Bang model. Elsewhere, Wsol has proposed a similar toroidal model of the Universe, in which he examines images of the CMBR maps to discover the two areas where the Universe emerges into being and disappears out of view [15]. However, [15] notes this may simply be a stitching artefact produced when the image was scanned. The author has not been able to confirm the position of the stitching artefact (and whether or not it corresponds to the location of the Great Repeller) at the present time.
3. The Inhomogeneous Universe
In 1929, Edwin Hubble published his observational findings revealing that the vast majority of stars, galaxies and nebulae were redshifted with respect to the Earth. It was also revealed that the further these objects were from the Earth, the more redshifted they appeared. This curious fact led him to conclude, as several others including Georges Lemaître had done before him, that the Universe was expanding. Even more curiously, it appeared that Earth was at the centre of this Cosmic Expansion, an auspicious/odious position that it had not been privileged to since Medieval Times. Today, with our advanced knowledge of the Big Bang, we know that the Earth does not occupy any privileged position in the Universe and that each position in space is expanding away from every other, like points on the surface of an inflating balloon.
Just as the Big Bang has expanded space, our knowledge of the Big Bang has also been expanded by “Inflation Theory”, which is one of the more enriching and bizarre theories of astrophysicist Alan Guth. Inflation Theory was developed, in large part, to solve the inconsistencies in the Big Bang model; namely, why the Universe is both homogeneous (the same in all places) and isotropic (the same in all directions). The isotropy of the Universe is predicated on the Cosmic Microwave Background radiation (or CMB), which is uniform to 1 part in 100,000 in all directions. In an interview with neuroscientist and investment banker Robert Lawrence Kuhn, Guth explained how the Universe could be inhomogeneous and yet still isotropic “if it were surrounded by shells”, but went on to state that this is “highly implausible” and that “no-one really thinks that”.
When Guth stated “no-one really thinks that”, he was of course talking about his fellow physicists [17]. However, if we look at the literature, we see that a great many research papers have been written on this subject. In 1987 and 1988, Koo and Kron published observational spectroscopic data obtained via the Cryogenic Camera spectrograph system at Kitt Peak National Observatory, which revealed large-scale fluctuations in the redshift distribution of galaxies [18]. In a 1990 letter to Nature from Broadhurst, entitled ‘Large-scale distribution of galaxies at the Galactic poles’, physicists called for these results to be verified and for the “implications for cosmology” to be determined [19]. In Astrophys Space Sci, J. G. Hartnett spoke of how an “Unknown selection effect simulates redshift periodicity in quasar number counts from Sloan Digital Sky Survey” [20]. In a paper by Hartnett & Hirano, a Fourier analysis of the number of galaxies as a function of redshift z (the N-z relation), in both the Sloan Digital Sky Survey and the 2dF Galaxy Redshift Survey, results in “galaxies preferentially located on co-moving concentric shells with periodic spacings” [21].
All of this data strongly suggests that the Universe is not homogeneous and that large concentric shells of galaxies do surround our own Milky Way.
Figure 2. Quadrupole differential map of the Cosmic Microwave Background Radiation (CMBR) [16].
Figure 3. Vertical alignment of the CMBR dipole with the equinox points and the SEP and NEP [22].
4. Axis of Evil
According to the Standard Model of Cosmology, the Universe should be homogeneous and isotropic. However, we know from observation that the Cosmic Microwave Background is anisotropic. Worse still, Max Tegmark discovered that the CMBR dipole is aligned with the orientation of the plane of the ecliptic and with the Earth’s equinox points. This discovery was originally made by Tegmark in the application of Gaussian blur filters to images of the CMBR [23] and was confirmed by WMAP and Planck data [24] (see Figure 2). While Guth has explained this anisotropy with respect to the Earth’s motion through the Heavens, it has further been found that the anisotropy of cosmic acceleration in the Union2 Type Ia supernovae aligns with the CMB dipole [25] [26], which cannot be explained in such terms. The same multipole alignment has also been revealed in the large-scale distribution of the spin directions of spiral galaxies [27].
Figure 4. This image has been attributed to Benedetta Ciardi [28], who is project leader at MPA for LOFAR (Max Planck Institute for Astrophysics).
Despite this evidence, there exists popular skepticism regarding this issue, which claims that the alignment with the equinoxes and ecliptic plane is not exact. In personal conversations with skeptics, the author has identified two main complaints: 1) the plane of the ecliptic is amorphous and has a range of values (i.e. it is not exact), and 2) the CMBR is amorphous and therefore the central point of any one “blob” of temperature is nebulous and inexact. However, from an astronomical point of view, the plane of the ecliptic is well-defined, and error bars are a natural part of how science is conducted in this and similar fields, such as cosmology and quantum mechanics. With regard to the second contention, the reader will also note that it does not state that the poles of the CMB are misaligned with the equinox points, but only that our methods for determining such an alignment are inexact. However, the statistical methods for finding centroids in amorphous blobs such as these are well-advanced and well-tested.
Despite these two clarifications, the author would like to steel-man the skeptics’ arguments and suppose that the alignment of the equinoxes differs from the dipole of the CMB. Even if this is the case, we can still find a vertical alignment between the equinox points and the dipoles (the red and blue circles in Figure 3) [23]. The reader will note that not only does this reinforce the alignment, but that these two circles also pass through the SEP and NEP (South Ecliptic Pole and North Ecliptic Pole), making the proposed alignment (arguably) even more sustained than previously thought. The author stipulates “arguably” because the SEP and NEP are naturally aligned with the equinox points, so an alignment of either one of these with the dipoles in the CMB should necessarily result in an alignment with the other.
5. Motivation
While all of this research and data is significant and interesting, for the purposes of our research, only the visual structure of the galaxies in the SDSS map is of true concern. It is clear when examining Figure 1 that there are distinct rings of galactic filaments surrounding our own galaxy (the blue dot). The colours of the galaxies change in line with their redshift, which is indicative of their distance from the Earth and in some cases the speed at which they are travelling. Note that in two locations the distribution of galaxies is obscured. This occultation is produced by the dust and material of our own galaxy, the Milky Way, as it cuts across the sky [29]. The missing data can be imputed visually using a variety of methods, but any method that requires a human hand will necessarily exhibit bias in its execution. Such biases can be seen displayed in Figure 4, where the placement and shapes of the galactic filaments appear different from those in Figure 6 (for example) and the extensions into the occluded areas appear unnatural and arbitrary and are presumably a result of the artist’s own prejudices.
Figure 5. This imputation was created by the author using the Adobe Photoshop program by mirroring and layering the image in Figure 1.
Figure 6. A more refined imputation made by eliminating the symmetries generated in Figure 5, by use of the “stamp tool” in Adobe Photoshop.
To eliminate this inherent bias as much as possible, the author proposes to use the inpainting Convolutional Neural Network called Deep Image Prior [12] [13] to impute the missing galactic data and, in so doing, discover millions of potential new galaxies. To create the Inpainting imputation, it was first necessary for the author to manually impute the galaxies in Figure 1. The process employed in this manual imputation can be seen in Figure 5 and Figure 6. In Figure 5, the author mirrored the image to fill in the missing information in a way that preserved the apparent homogeneous pattern. This, therefore, represents the first bias. In Figure 6, the author made an effort to remove the symmetric aspect of the data by means of the Photoshop “stamp tool” and thereby produce a more “natural appearance”. The second instance of bias, which is a negative sort of bias, can be seen in the top left-hand corner, where too much of the green data is removed. This was done to make the image conform to some aspects seen in Figure 4 and therefore to appear less symmetric and, as a result, less biased.
Based on the results of the Deep Image Prior Inpainting method, it is confirmed that the bias applied in this instance was negative, i.e. the author began to impute data in a way which was actually contrary to the purely statistical results produced by the Inpainting process. This negative bias was intentionally employed by the author, so as not to skew the results in so obvious a manner as might have been employed in Figure 4.
Regardless, the fact is that the Deep Image Prior method eliminated this bias by statistical means and seemingly without reference to the original imputation (Figure 7). Furthermore, this Deep Image Prior imputation was fed into a second iteration, with a more detailed algorithm and produced, more or less, the same results (Figure 8).
Figure 7. Deep Image Prior CNN imputation using the settings for “kate.png” or “peppers.png”.
Figure 8. Deep Image Prior CNN imputation using the settings for “library.png”.
The author also applied a Linear Regression ANN to a polar-transformed image of the SDSS 3D Map, for both the Northern and Southern Hemispheres. The ANN method is preferred over the manual imputation methods, as it contains no prior assumptions whatsoever and is entirely based upon the statistical imputation methods employed by the Linear Regressor ANN. More importantly, it also confirms what was found with the Deep Image Prior imputation method, revealing both the Great Attractor and the Great Repeller in more detail. Before we examine that method, however, it will be necessary to look at the Deep Image Prior method in more detail.
6. Deep Image Prior
In this section, the Deep Image Prior package is used to impute the missing data in the png image, in a process known as “inpainting”. According to one interpretation, Convolutional Neural Networks that employ inpainting “hallucinate” the missing image data [30]. Like any Neural Network, it starts off with a haze of random data and then the image slowly comes into focus. Inpainting can be used to restore damaged or missing sections of image files. However, to do this the undamaged image file must be present in the process, as a sort of reference file. In our case, this is difficult since the missing data was never there to begin with. However, we do have Figure 6, which is the author’s own artistic attempt at imputation. Therefore, we began by loading Figure 1 and Figure 6 into Ulyanov’s “Inpainting Colab notebook” [31].
There are four separate Inpainting algorithms that we can employ, but only two meet our needs. The first is supplied with a sample image called “peppers”. The hyperparameters of this algorithm appear to be tuned for shape and colour. The second is attached to a “library.png” sample image and appears to be more focused on intricate detail. The results can be seen in Figure 7 and agree that the morphology of the Universe most likely comprises large shells or rings of galaxies, all centred on our own Milky Way. The image speculatively reveals the existence of the Great Attractor, and there also appears to be confirmation of the Great Repeller, which the author had not considered before.
However, there is reason to be doubtful of the validity of the result in this case, as Deep Image Prior (DIP) becomes less and less accurate as the space it needs to fill becomes larger and larger. This is because DIP is only able to make predictions based on adjacent pixels. The fewer adjacent pixels there are, the more challenging it is for the model to predict [30]. However, in this case, it would appear that the algorithm is reasonably accurate up to the 0.1046 polar gridline (see Figure 1 for correct units). After this point, it is apparent that the accuracy drops significantly, with the algorithm being incapable of even correctly inpainting the polar gridlines, let alone the more complex galaxies.
The Inpainting imputation was attempted a second time using the library.png algorithm. Here, Figure 1 was the input and Figure 7 the reference image. The end result is noisy but produces the same bulk flow in the top left corner (i.e. the Great Attractor), reaffirms the concentric shell inhomogeneity, as well as indicating the existence of the Great Repeller (Figure 8).
7. Neural Network Imputation
Figure 9 was created by joining the galaxy information with the SpecObj data on bestobjid and objid. The selection was restricted to a petror90_r of greater than 10 arcseconds and a g magnitude of less than 17. This reproduces the characteristic, so-called “pie-diagram” seen in Figure 1. The purpose of mapping this from the primary SDSS data was to create as accurate a copy of the galactic distribution as possible and thereby facilitate the exploration and manipulation of the data. Using the polar-coordinate warp function from the scikit-image library [32], the image can be easily unwrapped. This puts it in the correct format to be fed into a linear regression ANN. A rolling window function can then be used to impute the missing data in both the Northern and Southern Hemispheres (see Figure 10). This also provides the RA and adjusted luminosity for each of the imputed galaxies. Note, however, that it does not contain any redshift information, as it is imputed from image data and not tabular data. Once again, the Great Repeller is a noticeable feature in Figure 10.
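The unwrapping and windowing steps can be sketched as follows. This is a minimal NumPy-only illustration under stated assumptions, not the author's code: the paper uses the scikit-image polar warp, whereas the hypothetical `unwrap_polar` below substitutes simple nearest-neighbour sampling, and the hypothetical `rolling_windows` assumes that each training sample is a run of consecutive angular pixels at a fixed radius with the following pixel as the regression target. The synthetic test image and the window length of 5 are placeholders.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def unwrap_polar(image, n_angles=360):
    """Nearest-neighbour polar unwrapping: rows index angle, columns index radius."""
    h, w = image.shape
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0      # centre of the pie diagram
    max_r = min(cy, cx)
    radius = np.linspace(0.0, max_r, int(max_r) + 1)
    theta = np.linspace(0.0, 2.0 * np.pi, n_angles, endpoint=False)
    rr, tt = np.meshgrid(radius, theta)        # both of shape (n_angles, n_radii)
    ys = np.rint(cy + rr * np.sin(tt)).astype(int)
    xs = np.rint(cx + rr * np.cos(tt)).astype(int)
    return image[ys, xs]


def rolling_windows(series, window):
    """Split a 1-D series into (window preceding values, next value) pairs."""
    views = sliding_window_view(series, window + 1)
    return views[:, :-1], views[:, -1]


# Synthetic stand-in for the map: pixel value equals distance from the centre.
yy, xx = np.mgrid[0:101, 0:101]
image = np.hypot(yy - 50, xx - 50)

polar = unwrap_polar(image)                     # shape (360, 51)
X, y = rolling_windows(polar[:, 10], window=5)  # one ring, at radius index 10
print(polar.shape, X.shape, y.shape)
```

In the unwrapped image, the occluded wedges become contiguous bands of rows, so a regressor trained on windows from the observed rows can be stepped across a band one row at a time.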
In a future paper, the author would like to reattempt this imputation using tabular data, but even so there is no guarantee that this imputed redshift data would be accurate or meaningful. This is because the ANN algorithm is stochastic in nature and, as such, produces slightly different results each time. This means that a galaxy that was in position “A” during one imputation may be in position “B” in the next, or may be entirely absent. Be that as it may, the results may be useful in determining the large-scale structures in the imputed regions.
Figure 9. SDSS map of the Universe; blue area (0.02 - 0.03); green (0.03 - 0.04); yellow (0.04 - 0.05); red (0.05 - 0.06); black (0.06 - 0.2) redshift.
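One way to separate persistent large-scale structure from run-to-run noise is to stack several imputations and measure how often each pixel is filled. The sketch below is an assumption about how such a check could be done and is not described in the paper; the hypothetical `persistence_map`, the threshold of 0.5, and the synthetic runs (a fixed square of signal plus Gaussian noise) are all illustrative.

```python
import numpy as np


def persistence_map(runs, threshold):
    """Fraction of runs in which each pixel exceeds the threshold."""
    stack = np.stack(runs)                   # shape (n_runs, height, width)
    return (stack > threshold).mean(axis=0)


# Synthetic stand-in for repeated stochastic imputations of the same region.
base = np.zeros((20, 20))
base[5:10, 5:10] = 1.0                       # a structure present in every run
runs = [
    base + np.random.default_rng(seed).normal(0.0, 0.1, base.shape)
    for seed in range(10)
]

persistence = persistence_map(runs, threshold=0.5)
print(persistence[7, 7], persistence[0, 0])
```

Pixels with a persistence close to 1 recur in nearly every run, whereas individual galaxies that move between runs average out to low values.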
Figure 10. Polar-transformation and imputation of the original data, as seen in Figure 9. This image contains information about the Northern and Southern Hemispheres.
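The redshift colour coding of Figure 9 can be expressed as a simple binning step. The sketch below is illustrative only: the bin edges and colour names are taken from the Figure 9 caption, while the hypothetical `redshift_colour` helper and the treatment of values falling exactly on a bin edge (assigned to the higher bin) are assumptions.

```python
import numpy as np

# Bin edges and colours as listed in the Figure 9 caption.
EDGES = np.array([0.02, 0.03, 0.04, 0.05, 0.06, 0.2])
COLOURS = ["blue", "green", "yellow", "red", "black"]


def redshift_colour(z):
    """Return the Figure 9 colour for redshift z, or None if z is outside 0.02 - 0.2."""
    idx = int(np.digitize(z, EDGES))
    if idx == 0 or idx == len(EDGES):
        return None                          # below the first or above the last edge
    return COLOURS[idx - 1]


for z in (0.025, 0.045, 0.1, 0.3):
    print(z, redshift_colour(z))
```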
Over numerous instances of the ANN algorithm, persistent structures were visible. These structures appear in Figure 11. At the location of the green oval, we see a void creeping into the right-hand side of the imputed region. Opposite this, there is a sweeping line. This line appears to be the wake of the Dark Flow, piercing into the heart of the Great Attractor. Structures within the Great Attractor itself are indicated by the purple oval, where there appear to be two dark, concentrated patches. In other instances (not shown), these patches were much more clearly spiral in shape. A spiral vortex at the location of the Great Attractor might be expected if it were the location of a supermassive galaxy and/or supermassive black hole. The fact that two such dark spots are visible in this image indicates that there might be two of these supermassive black holes and hence two Great Attractors.
This idea that there are two black hole vortices vying for dominance side by side on the far side of the Milky Way, and of the Universe, probably has its origins in the discussions surrounding twin vortices at the Venusian South Pole. If correct, then this may indicate that this is the South Pole of the Universe, as opposed to the North Pole, i.e. in the “Northern Hemisphere” of the Universe.
Figure 11. Large scale structures in the Great Attractor. Green oval; void. Green line; sweeping void. Purple oval: Twin blackhole vortices and/or galaxy concentrations.
Figure 12. SDSS imputation using a batch size of 32 and 24 training epochs.
The model has an input dimension of 500, with the same number of units in the first layer. This layer has rectified linear unit activation, as recommended. The hidden layer has 1600 nodes. The output layer has one unit and a linear activation function. The optimizer used was “adam”, and the loss function is mean absolute error (“mae”). The data were split with a test size of 0.25. A KerasRegressor wrapper was used with a batch size of 25 and 5 training epochs. While this is admittedly a low batch size, the author noted that the results were generally consistent with those from larger batch sizes and more training epochs. For example, Figure 12 shows results using a batch size of 32 and 24 training epochs. Experiments were also run with a batch size of 64 and showed similar results. A low batch size was initially used as the training tends to converge early, after one or two epochs.
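The layer sizes described above can be checked with a small forward-pass sketch. This is a NumPy stand-in written for illustration and is not the author's Keras model: the layer widths, output activation and loss follow the text, while the ReLU activation on the 1600-node hidden layer, the weight initialisation and the random input batch are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)


def dense(n_in, n_out):
    """Weights and biases for one fully connected layer (illustrative initialisation)."""
    return rng.normal(0.0, 0.05, size=(n_in, n_out)), np.zeros(n_out)


# 500 inputs -> 500 units -> 1600 units -> 1 output, as described in the text.
layers = [dense(500, 500), dense(500, 1600), dense(1600, 1)]


def relu(x):
    return np.maximum(x, 0.0)


def forward(x):
    (w1, b1), (w2, b2), (w3, b3) = layers
    h1 = relu(x @ w1 + b1)           # first layer: ReLU, as stated
    h2 = relu(h1 @ w2 + b2)          # hidden layer: ReLU is an assumption
    return (h2 @ w3 + b3).ravel()    # output layer: single unit, linear


def mae(y_true, y_pred):
    """Mean absolute error, the loss named in the text."""
    return float(np.mean(np.abs(y_true - y_pred)))


n_params = sum(w.size + b.size for w, b in layers)
batch = rng.normal(size=(25, 500))   # one batch of 25 samples
pred = forward(batch)
print(n_params, pred.shape)
```

With these widths the network has just over one million trainable parameters, most of them in the connection between the 500-unit and 1600-unit layers.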
8. Conclusions
The research in this paper follows on from previous research [11], where the author attempted to impute the missing galaxies in the SDSS map using Deep Image Prior (DIP) Inpainting [12] [13], polar transformations and Linear Regressor Artificial Neural Networks (ANN). In that case, the ANN imputation was specifically focused on the Northern Hemisphere, while the DIP imputed the entire visible universe. That research confirmed the existence of the Great Attractor and the Homogeneous nature of the Universe. In this paper, the author has attempted to impute the Southern Hemisphere with the Linear Regressor ANN and complete the map of the Universe. The results once again confirm the existence of the Great Attractor and the homogeneity of the Universe, and reveal a new development in the Southern Hemisphere, which the author describes as “The Great Repeller”. As the name suggests, this is the counterpart to the Great Attractor and appears to be a region of repulsive force containing relatively few galaxies.
A combination of Convolutional Neural Networks and ANNs has allowed us to impute much of the galactic data hidden from view by the “galactic plane” and has potentially reconfirmed the size of the Great Attractor, the existence of the Great Repeller, and structures within them. The exact details of the imputation (such as RA and adjusted luminosity) are stochastic in nature and are therefore not reliable. However, based on this imputation, it is possible to analyse the large-scale structures in the Universe. From this perspective, new voids are discovered, along with a sweeping arc-like void, as well as two new dense regions, which may be supermassive black holes. There is also the void of the Great Repeller itself, which is another structure.
In future research, the author would like to use purely tabular data to conduct the imputation and thereby obtain exact values for RA, redshift and galaxy count numbers. These numbers could be useful in determining the exact distribution of stars in the entire Visible Universe and may have further application in subsequent geophysical and astronomical research.
Data Availability
The data is available from SciServer: https://apps.sciserver.org/compute/. SciServer is a relational database cloud computing platform built upon the Microsoft SQL Server Database [33]. SciServer permits data storage and querying of the SDSS database via Jupyter notebooks and provides storage space (MyDB). To access SciServer Compute, it is necessary to click on the menu at the top right-hand corner near the username field, select “Compute” and then select “Create Container”. To recreate the graphs and data in this paper, query the latest data release, “DR17”, join the galaxy information with the SpecObj data on bestobjid and objid, and use a petror90_r of greater than 10 arcseconds and a g magnitude of less than 17.
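The selection described above could be written as a SQL query along the following lines. This is a hedged sketch rather than the author's exact query: the join condition and the two cuts follow the text, but the use of the `Galaxy` view as the source of the galaxy information, the list of selected columns, and the hypothetical `build_query` helper are assumptions. The code only builds the query string, which can then be run against DR17 from a SciServer notebook.

```python
def build_query(min_petror90_r: float = 10.0, max_g: float = 17.0) -> str:
    """Build a SQL string joining galaxy photometry with spectroscopic data."""
    return (
        "SELECT g.objid, g.ra, g.dec, g.g, g.petror90_r, s.z\n"
        "FROM Galaxy AS g\n"                          # assumed source of galaxy information
        "JOIN SpecObj AS s ON s.bestobjid = g.objid\n"
        f"WHERE g.petror90_r > {min_petror90_r} AND g.g < {max_g}"
    )


query = build_query()
print(query)
```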