An Efficient and Anonymous Multidimensional Data Aggregation Scheme Based on Fog Computing for Smart Grids
1. Introduction
As an advanced power system, the smart grid (SG) uses modern information and communication technology [1] [2] to support real-time data exchange and thereby improve the efficiency of energy management. By integrating traditional power grid infrastructure with technologies such as cloud computing [3], fog computing [4], and mobile edge computing [5], the smart grid enables real-time data collection, transmission, and processing. This integration supports two-way communication between utility companies and consumers, improving the efficiency, affordability, and stability of power management. A fundamental element of the SG is the smart meter (SM) [6], which is installed at the consumer’s premises to gather real-time electricity usage data and periodically report it to the control center (CC). SMs are considered the most critical element on the consumer side, since they allow users to actively report their real-time energy usage. Based on these data, the CC can predict power demand, adjust power generation, implement dynamic pricing strategies, and optimize overall grid management. The CC may also carry out further statistical analysis in order to provide other organizations with relevant data for commercial advertising or decision-making in the energy sector [7]. Despite these advantages, the smart grid faces various security and privacy challenges. For example, by analyzing real-time power consumption data, malicious actors could uncover users’ habits and activities, leading to privacy leakage. In addition, the aggregated data must remain correct even if certain SMs fail to submit their data. Consequently, it is essential to adopt robust encryption technologies to safeguard user privacy.
Privacy-preserving data aggregation [8] is widely recognized as an effective approach for protecting data security in smart grids. SMs can employ homomorphic encryption (HE), for example Paillier homomorphic encryption [9], to encrypt their electricity usage data. The aggregator gateway (AG) can then aggregate the encrypted data so that the CC receives only the total over all SMs, which preserves privacy [10]-[12]. Earlier aggregation methods primarily employed one-dimensional data aggregation, which was effective but lacked the fine-grained data required for more detailed energy consumption analysis. Recent schemes have introduced multidimensional data aggregation, which enables the collection of electricity consumption data across appliance categories such as wall lamps, refrigerators, air conditioners, and dryers. The more fine-grained the data, the more effective the subsequent analysis becomes. However, multidimensional aggregation often involves complex cryptographic operations for encryption and decryption, which adds to the computational cost. Traditional aggregation schemes rely on low-performance intermediate nodes that are often connected to many end nodes, limiting their ability to handle such cryptographic operations efficiently. To address this challenge, fog nodes (FNs) [13] have been introduced into smart grids to enhance communication, computation, and storage capabilities, improving the efficiency and scalability of data aggregation.
As the number of data dimensions increases, communication costs grow. A critical challenge is to minimize communication costs while maintaining user privacy. Previous aggregation schemes have employed masking techniques [14] [15] to hide individual users’ electricity consumption data, which reduces computational and communication overhead. However, these schemes fall short in terms of fault tolerance, as they require retransmission of billing reports in the event of node failures. Fault tolerance is essential for robust data aggregation in smart grids: when SMs temporarily fail and cannot report data to FNs, fault tolerance mechanisms help maintain data integrity. Several mechanisms have been proposed to achieve fault tolerance in the presence of faulty SMs. In [16], user data is encrypted via a modified variant of the Paillier cryptosystem; however, as the number of faulty SMs increases, the decryption cost rises substantially. Similarly, the framework proposed by Boudia et al. [17] incorporates a fault-tolerant method to guarantee the accuracy of the final aggregated data retrieved at the CC despite the failure of certain nodes. This scheme employs a masking technique to protect individual user data, where the masks are designed to cancel each other out during aggregation. However, this approach requires the FN to know the real identities of the SMs, which introduces privacy concerns.
To solve the above problems, we propose EAMA, an efficient and anonymous multidimensional data aggregation scheme based on fog computing for smart grids. The main contributions of this paper are summarized as follows.
1) We improve the Paillier cryptosystem by designing an encoding function that packs multiple data items into a single ciphertext, from which the CC can recover the aggregated data. In addition, the most computationally expensive operation in Paillier encryption is the high-order modular exponentiation of the random factor, which we optimize to reduce the computational overhead.
2) Since the data aggregator (the FN) is honest-but-curious, we propose a batch anonymous authentication algorithm based on bilinear pairings that efficiently verifies data integrity while preserving users’ anonymity.
3) We offer a fault-tolerant approach for ensuring the accurate recovery of the final aggregated data at the CC. When an SM fails, our mechanism solves the problem of blind factor elimination without requiring the real ID of the faulty SM.
The remainder of this article is structured as follows. Section 2 reviews related work. Section 3 presents preliminaries. Section 4 describes the system design of EAMA, and Section 5 presents the proposed scheme. Section 6 provides the system analysis of EAMA, and Section 7 evaluates its performance. Finally, Section 8 concludes the paper.
2. Related Work
In recent years, various data aggregation methods have been introduced for smart grids. Among these, the work presented in [18] introduced a secure in-network aggregation approach specifically designed for smart grid environments. This approach leverages HE to ensure that users’ private data remain undisclosed to intermediate aggregator nodes. As smart grids evolve, there is an increasing necessity for CC to obtain detailed data for enhanced services and to optimize demand response strategies. At the same time, it is crucial to preserve the privacy of individual users. HE emerged as a particularly advantageous approach for encrypted data aggregation owing to its homomorphic characteristics, which allow CC to execute statistical analyses effectively on encrypted data, thereby preserving the secrecy of user information.
In 2012, Lu et al. [19] introduced a multidimensional data aggregation framework based on Paillier HE. Their technique employed a super-increasing sequence so that the CC could compute the sum of each type of power usage data. However, super-increasing sequences become less efficient for data packing when the dimensionality is high. To tackle this issue, Li et al. [20] introduced a multi-subset data aggregation strategy based on Paillier HE that incorporates two super-increasing sequences. This design allowed the aggregation of data from different ranges, enabling the CC to obtain more granular insights, such as the total consumption within specific ranges and the number of consumers in each range. Despite these advancements, as noted in [21], the scheme restricts each subset of SMs to predefined data ranges, limiting the flexibility and utility of the aggregated data. Boudia et al. [22] proposed a secure multidimensional data aggregation scheme based on elliptic curves that uses ElGamal HE with multiple public keys. Their approach avoided complex encryption operations during data transmission, reducing both computational and communication overhead. However, the scheme required additional elliptic curve scalar multiplications for each dimension at the SM, which imposed a significant computational burden. In contrast, Zuo et al. [23] combined super-increasing sequences with ElGamal HE for multidimensional data aggregation. Their technique devised two kinds of super-increasing sequences: one for computing the aggregate electricity usage of each type across all users, and the other for counting the users whose electricity usage falls within specified intervals.
The data aggregation schemes proposed in [14] [15], and [24] employ masking techniques to conceal individual users’ electricity usage data. When all masked reports are aggregated, the masks cancel each other out, preserving user privacy while revealing the overall power usage. These schemes are notable for their low computation and communication overhead. However, they lack fault tolerance and require additional billing reports to be transmitted in the event of failures. Zhang et al. [25] established a privacy-preserving billing mechanism that integrates ElGamal-based multidimensional data aggregation with an anonymous user identity design. This method safeguards against collusion attacks mounted by any two parties within the system. Additionally, the ciphertexts generated by ElGamal encryption are typically smaller than those produced by Paillier encryption, making ElGamal more efficient in terms of ciphertext storage for multidimensional data. Boudia et al. [17] integrated HE into a fog computing architecture to encode multidimensional data and employed a batch authentication technique for efficient authentication. Although their method, known as ESMA, is fault tolerant, it requires blind factor updates for all functioning SMs.
To address these limitations, we introduced encoding functionality into EAMA, utilizing the Paillier homomorphic encryption (HE) technique to efficiently construct and secure multidimensional data. EAMA employs a pseudonymization technique, enabling the CC to read and process aggregated data without requiring blind factor updates for all functioning SMs, even in cases where some SMs are non-functional. This ensures that the CC can continue processing aggregated reports, maintaining fault tolerance and operational efficiency.
3. Preliminaries
3.1. Bilinear Pairing Maps
A bilinear pairing map $e: G_1 \times G_1 \rightarrow G_2$ is defined on an elliptic curve $E$ over the finite field $F_p$, where $p$ is a large prime. In this context, $G_1$ denotes an additive cyclic group of order $q$ derived from the elliptic curve $E$, while $G_2$ represents a multiplicative cyclic group of the same order $q$. The bilinear pairing map exhibits the following properties:
1) Bilinearity: For every $P, Q \in G_1$ and $a, b \in Z_q^*$, it holds that $e(aP, bQ) = e(P, Q)^{ab}$.
2) Non-degeneracy: There exist two elements $P, Q \in G_1$ such that $e(P, Q) \neq 1$, where 1 denotes the identity element of $G_2$.
3) Computability: For any $P, Q \in G_1$, there exists an efficient algorithm to compute $e(P, Q)$.
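The three properties above can be checked numerically on a toy map. The sketch below is illustrative only and is not the elliptic-curve pairing used by the scheme: it models the additive group as the integers modulo a small prime `Q` and the target group as the order-`Q` subgroup of the integers modulo `P_MOD`. The parameters `Q`, `P_MOD`, `G` and the helper names are assumptions chosen for readability, and the construction offers no security.

```python
# Toy bilinear map for illustration only (insecure, not an elliptic-curve pairing).
# The additive group is Z_Q; the target group is the order-Q subgroup of Z_P_MOD^*.
Q = 11        # prime group order
P_MOD = 23    # prime with Q dividing P_MOD - 1
G = 2         # element of order Q modulo P_MOD (2**11 % 23 == 1)


def pairing(u: int, v: int) -> int:
    """e(u, v) = G^(u*v) mod P_MOD for u, v in the additive group Z_Q."""
    return pow(G, (u * v) % Q, P_MOD)


def check_bilinearity(u: int, v: int, a: int, b: int) -> bool:
    """Check e(a*u, b*v) == e(u, v)^(a*b)."""
    lhs = pairing((a * u) % Q, (b * v) % Q)
    rhs = pow(pairing(u, v), a * b, P_MOD)
    return lhs == rhs


if __name__ == "__main__":
    print(check_bilinearity(3, 7, 4, 5))   # bilinearity holds
    print(pairing(1, 1) != 1)              # non-degeneracy
```

Because discrete logarithms are trivial in this toy group, the map is only useful for checking algebra, for example when reasoning about signature verification equations.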
3.2. Optimization of the Paillier Cryptosystem for Higher Order Polynomial Operations
The Paillier cryptosystem [9], after optimizing the power operation
in its original encryption process, encounters its most resource-intensive computation in the evaluation of the higher-order power function
. In 2010, Ivan Damgård [26] proposed an optimization strategy to streamline the calculation of
, demonstrating that this enhancement preserves the security level of the original Paillier algorithm. Therefore, the security of the modified Paillier encryption algorithm can be analyzed in the same way as that of the original Paillier encryption algorithm.
1) Key Generation: To ensure security, it is required that
, and
.
Compute
. Select a random number
such that
, and calculate
. Choose a natural number
. In the original Paillier scheme, this corresponds to setting
. Compute
. The optimized public key is defined as:
and the private key is:
.
2) Encryption: Generate a random number
, where
, and
is the key length. The optimized encryption formula is:
. By choosing
, the computational efficiency of
is significantly improved compared to
in the original encryption process.
3) Decryption: To decrypt the ciphertext
, the plaintext
is recovered using the equation:
, where
is the decryption function used in the Paillier cryptosystem.
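The following sketch contrasts standard Paillier encryption with a short-exponent variant in the spirit of the optimization described above. It rests on several assumptions that are not confirmed by the text: the generator is taken as n + 1, the precomputed base is taken as h_s = (-x^2)^n mod n^2 (the case where the natural number parameter equals 1), and decryption uses the usual (lambda, mu) pair. All function names are hypothetical, and the primes are tiny so that the arithmetic is easy to follow.

```python
from math import gcd
import secrets


def keygen(p: int, q: int):
    """Toy key generation; p and q are tiny primes used for readability only."""
    n = p * q
    lam = (p - 1) * (q - 1) // gcd(p - 1, q - 1)   # lcm(p - 1, q - 1)
    mu = pow(lam, -1, n)                            # valid because g = n + 1
    return n, lam, mu


def random_unit(n: int) -> int:
    """Sample an element of Z_n that is coprime to n."""
    while True:
        x = secrets.randbelow(n - 1) + 1
        if gcd(x, n) == 1:
            return x


def encrypt_original(n: int, m: int) -> int:
    """Standard Paillier with g = n + 1: c = (1 + m*n) * r^n mod n^2."""
    n2 = n * n
    r = random_unit(n)
    return (1 + m * n) * pow(r, n, n2) % n2


def make_hs(n: int) -> int:
    """Precompute h_s = (-x^2)^n mod n^2 once (assumed form of the optimization)."""
    x = random_unit(n)
    h = (-x * x) % n
    return pow(h, n, n * n)


def encrypt_fast(n: int, hs: int, m: int, r_bits: int) -> int:
    """c = (1 + m*n) * h_s^r mod n^2 with a short random exponent r."""
    n2 = n * n
    r = secrets.randbits(r_bits) + 1
    return (1 + m * n) * pow(hs, r, n2) % n2


def decrypt(n: int, lam: int, mu: int, c: int) -> int:
    """m = L(c^lam mod n^2) * mu mod n, where L(u) = (u - 1) // n."""
    u = pow(c, lam, n * n)
    return ((u - 1) // n) * mu % n


if __name__ == "__main__":
    n, lam, mu = keygen(11, 19)          # 11 and 19 are both 3 mod 4, gcd(10, 18) = 2
    hs = make_hs(n)
    c1 = encrypt_fast(n, hs, 17, r_bits=8)
    c2 = encrypt_original(n, 25)
    # Multiplying ciphertexts adds the underlying plaintexts.
    print(decrypt(n, lam, mu, c1 * c2 % (n * n)))
```

The saving comes from the exponent length: `encrypt_original` raises a random value to the full-size exponent n, whereas `encrypt_fast` raises a fixed precomputed base to a much shorter random exponent. Both kinds of ciphertext decrypt with the same routine and can be multiplied together.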
4. System Design
4.1. System Model
Our proposed method emphasizes the secure aggregation of sensor data within a fog-computing-based smart grid system while ensuring the privacy of sensitive information. The system model involves four types of entities: a group of SMs,
located at the network edge; FNs situated near smart devices; a remote CC; and a Key Generation Center (KGC), as depicted in Figure 1.
Figure 1. System model.
1) SMs: SMs, denoted as
, are terminal devices within the Internet of Things (IoT) network. These meters are equipped with sensors and computational components to gather real-time data from their environment. The collected data is periodically reported to the nearest FN for further processing.
2) FNs: FNs are positioned near smart devices to enable communication between
and the CC. The FN offers critical services to the CC, including data validation, processing, and storage. In our scheme, the FN acquires data from
within its jurisdiction, authenticates and aggregates the data, and subsequently transmits the aggregated ciphertext to the CC for further analysis.
3) CC: The CC is tasked with producing system parameters during initialization and analyzing the aggregated data uploaded by the FN. The CC is crucial in overseeing and administering the system’s overall functionality, utilizing the consolidated data obtained from the FNs.
4) KGC: The Key Generation Center (KGC) acts as a trusted third party in the cryptosystem. It distributes public system parameters and private keys to the SMs and the CC, and generates pseudonyms for each SM to protect user identity privacy.
4.2. Design Goals
Our objective is to design an efficient and anonymous privacy-preserving multidimensional data aggregation scheme for the SG based on fog computing, one that safeguards the critical information transmitted by smart terminal devices while maintaining user privacy. The following goals must be achieved:
1) Security: The proposal must ensure the confidentiality, integrity, and authenticity of data transmitted among system entities. Outside adversaries (
) must be denied access to decrypted data. Any modification of messages should be detectable, and the FNs and the CC should authenticate transmitted data and verify the identities of legitimate entities.
2) Privacy Protection: User privacy is paramount in SG. Decryption keys are known only to the CC. Ciphertext computations are performed exclusively by the FN. KGC is aware of users’ real identities, ensuring isolation between keys, ciphertext, and identity. This isolation guarantees that total power consumption can be obtained without revealing user privacy. Individual privacy in IoT applications is crucial. In this proposal, individual privacy is an unrevealed secret to
. Even if there is collusion between the FN and CC, individual privacy is protected when user
employs anonymous privacy encryption.
3) Fault Tolerance: In Smart Grids,
may experience communication failures. The proposal must include fault tolerance mechanisms to ensure correct aggregation and decryption in the event of certain
failures.
4) Performance: Performance is crucial for both SMs and FNs to meet the practical demands of data aggregation in smart grids. Computation and communication costs must be kept as low as possible so that large volumes of data can be processed promptly and efficiently in large-scale SGs.
5. Proposed Scheme
In this section, we present EAMA for multidimensional data aggregation. Table 1 lists the acronyms and symbols used in this article together with their definitions. The scheme consists of five phases: system initialization, registration, report generation, data aggregation, and data reading.
Table 1. Notations.
Notation | Definition
CC | The control center
FN | The fog node
SM | The smart meter
 | The modulus defined as
 | The generator of the group
 | An additive group
 | A multiplicative group
 | The public key pair
 | The private key pair
 | A bilinear pairing
 | A secure hash function,
 | A secure hash function
 | The number of SM covered by
 | The secret share of
 | The secret key of
 | The public key of
 | The data type of
 | The encoded representation of
 | The total count of data categories
 | The upper limit for a data category’s value
5.1. System Initialization
Based on the security parameter
, KGC is required to generate two
-bit prime numbers,
and
, which satisfy the conditions:
,
. The KGC subsequently chooses two large
-bit primes,
and
, and calculates the public key of the Paillier encryption system as
,
. The associated private key is expressed as
. Subsequently, the KGC constructs a function
and computes
as
A random number
is then chosen, where
, and
. A natural number
is chosen, and for the original Paillier system,
. For
, we have
. The public key is denoted as the tuple
, whereas the associated private key is represented by the tuple
. The KGC defines a bilinear pairing map
, where
and
are two multiplicative cyclic groups of identical order
, and
serves as the generator of
. The KGC further establishes four collision-resistant hash functions:
and
.
The KGC establishes the upper limit of FNs at
, and defines
as the indexed collection of operational SMs. A pseudo-random number generator produces
secret shares
, where
, and
. The KGC then computes
:
(1)
5.2. Registration
1)
Registration and Pseudonym Selection: Each SM
produces its unique identity
, randomly chooses a private key
, and derives its public key as
, with
representing the group’s generator. The pseudonym
is computed as:
The signature
is then calculated as:
The SM
sends
to KGC along with the timestamp
.
Upon receiving the registration request, the KGC first verifies the signature:
If the signature is valid, the KGC generates a list of pseudonyms
based on the real identity
. The pseudonym list corresponding to
of
is as follows:
.
During the
-th anonymous update cycle, the KGC selects a random number
(where
) to modify the pseudo-identity of
. The updated pseudonym is computed as:
The new signature
is calculated as:
(2)
The KGC verifies the following equation to ensure consistency:
(3)
If the formula is satisfied, the tuple
is recorded. After successful verification and registration, the KGC sends
to the FN and CC for further processing.
2) FN Registration: The FN randomly selects
as its private key and computes the public key
, and then sends
to the KGC.
3) CC Registration: CC randomly selects
as the private key and computes the public key
. Then, CC sends
to KGC.
At the end of this phase, KGC publishes the system parameters as
5.3. Report Generation
During the time period
, each
encrypts the
-dimensional data
and simultaneously creates the corresponding signature, as illustrated in Figure 2.
Figure 2. Data operation flowchart.
Step 1:
encodes
into the binary string
where each of the
and assigns its private information as
.
. Then encode,
to compute
.
(4)
Step 2: In the original Paillier encryption method,
selects a random number
and calculates the encrypted content as
(5)
In our modified Paillier cryptosystem, the SM instead randomly generates
, where k is the key length, and computes the ciphertext as
(6)
Step 3:
uses its private key
to generate the signature as follows:
where TS denotes the current timestamp.
Step 4:
transmits the verified data report information
to the corresponding
.
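The encoding in Step 1 can be pictured as bit-packing: each dimension occupies its own fixed-width chunk, wide enough that the sum over all meters cannot overflow into the neighbouring chunk. The sketch below is a minimal illustration under that assumption; the exact form of Equation (4) is not reproduced here, and `chunk_width` and `encode` are hypothetical helper names.

```python
def chunk_width(num_meters: int, max_value: int) -> int:
    """Bits needed so that a sum of num_meters values, each <= max_value, fits in one chunk."""
    return (num_meters * max_value).bit_length()


def encode(values, width: int) -> int:
    """Pack values[0] into the lowest chunk, values[1] into the next chunk, and so on."""
    packed = 0
    for j, v in enumerate(values):
        if v < 0 or v.bit_length() > width:
            raise ValueError("value does not fit in a chunk")
        packed |= v << (j * width)
    return packed


if __name__ == "__main__":
    width = chunk_width(num_meters=3, max_value=100)   # room for sums up to 300
    reading = [5, 7, 2]                                # one meter, three dimensions
    print(width, bin(encode(reading, width)))
```

The packed integer is what would be encrypted as a single plaintext, so the number of ciphertexts per report stays at one regardless of how many dimensions are reported, provided the packed value remains below the plaintext modulus.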
5.4. Data Aggregation
1) All-Inclusive Data Aggregation: During the period
, we posit the occurrence of certain SM failures within the SG, and use
to denote the index set of working SMs. Let
describe the quantity of operational SMs under
, and let
signify the minimum threshold of the effective sample size. When
, verified data report messages (e.g.,
) are received from legitimate SMs,
performs batch verification of the received signatures by checking whether the following equations hold:
(7)
(8)
Batch verification reduces the number of pairing operations from
to
. After successful verification,
aggregates all encrypted data and transmits the result to the CC through the following steps.
Step 1:
aggregates
encrypted ciphertexts into
(9)
Step 2:
generates the signature utilizing its private key
in the following manner.
(10)
where TS denotes the current timestamp.
Step 3:
sends a full report to CC containing
(11)
If some SMs break down,
will not receive the corresponding packets.
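To make the idea of batch verification concrete, the sketch below uses a generic BLS-style signature over a toy bilinear map. This is an assumption made for illustration and is not the scheme's own Equations (7) and (8): the signature form, the hash `h1`, the parameters `Q`, `P_MOD`, `G`, and all function names are hypothetical, and the toy map is insecure. In this sketch, checking reports one by one costs two pairings per report, while the batch check costs one pairing for the summed signatures plus one per report.

```python
import hashlib

# Toy parameters: G has prime order Q modulo P_MOD (insecure, illustration only).
Q, P_MOD, G = 1019, 2039, 4


def pairing(u: int, v: int) -> int:
    """Toy bilinear map e(u, v) = G^(u*v) mod P_MOD on the additive group Z_Q."""
    return pow(G, (u * v) % Q, P_MOD)


def h1(msg: bytes) -> int:
    """Hash a message into Z_Q (stands in for a hash-to-point function)."""
    return int.from_bytes(hashlib.sha256(msg).digest(), "big") % Q


def public_key(sk: int) -> int:
    """Y = sk * P with the generator P = 1 of Z_Q."""
    return sk % Q


def sign(sk: int, msg: bytes) -> int:
    """BLS-style signature: sigma = sk * H(msg)."""
    return (sk * h1(msg)) % Q


def verify_one(pk: int, msg: bytes, sig: int) -> bool:
    """Two pairings per report: e(sigma, P) == e(H(msg), Y)."""
    return pairing(sig, 1) == pairing(h1(msg), pk)


def verify_batch(pks, msgs, sigs) -> bool:
    """One pairing on the summed signatures plus one pairing per report."""
    lhs = pairing(sum(sigs) % Q, 1)
    rhs = 1
    for pk, msg in zip(pks, msgs):
        rhs = rhs * pairing(h1(msg), pk) % P_MOD
    return lhs == rhs


if __name__ == "__main__":
    sks = [3, 57, 400]
    msgs = [b"report-1", b"report-2", b"report-3"]
    sigs = [sign(sk, m) for sk, m in zip(sks, msgs)]
    print(verify_batch([public_key(sk) for sk in sks], msgs, sigs))
```

A single altered signature changes the summed value on the left-hand side, so the batch check fails as a whole; locating the offending report then requires falling back to individual checks.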
2) Fault-Tolerant Data Aggregation: When certain SMs are unable to transmit data,
will not obtain the relevant packets. Let
denote the collection of all operational SM units in
, and let
represent the subset of defective SM devices (
). Consequently, we get
(12)
This directly affects the correctness of the final decryption result. Therefore,
must transmit the set
to the KGC. Subsequently, the KGC generates a new
based on
and transmits it to the CC.
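The effect of missing reports on the blinding values, and the role of the regenerated value, can be sketched with plain modular arithmetic. The model below is an assumption for illustration: secret shares are treated as additive masks modulo a public modulus, the CC-side value is the negated sum of the shares of the meters that actually reported, and meters are indexed by pseudonym only. `MODULUS` and all function names are hypothetical.

```python
MODULUS = 2**61 - 1   # hypothetical public modulus


def cancel_value(shares: dict, ids) -> int:
    """Value that cancels exactly the shares of the listed pseudonyms."""
    return (-sum(shares[i] for i in ids)) % MODULUS


def aggregate(readings: dict, shares: dict, ids) -> int:
    """Sum of blinded readings from the meters that reported."""
    return sum(readings[i] + shares[i] for i in ids) % MODULUS


def recover(aggregate_value: int, cancel: int) -> int:
    """Remove the blinding from an aggregate."""
    return (aggregate_value + cancel) % MODULUS


if __name__ == "__main__":
    shares = {"pid1": 111, "pid2": 222, "pid3": 333}
    readings = {"pid1": 5, "pid2": 12, "pid3": 7}
    working = ["pid1", "pid3"]                     # pid2 failed to report

    agg = aggregate(readings, shares, working)
    stale = cancel_value(shares, shares.keys())    # computed for all meters
    fresh = cancel_value(shares, working)          # recomputed for the working set

    print(recover(agg, stale))   # wrong: the share of pid2 is still subtracted
    print(recover(agg, fresh))   # 12, the sum of the two working meters
```

Only the party that knows the shares needs to recompute the cancelling value, and it can do so from the list of pseudonyms alone, so the working meters do not have to refresh their own shares.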
5.5. Data Reading
1) All-Inclusive Data Reading: Upon receiving a complete report from
, CC initially authenticates the signature in accordance with the subsequent equation:
(13)
When the equation is satisfied, it indicates that the signature is legitimate. Upon verifying the authenticity, CC decrypts the aggregated ciphertext
and extracts the aggregated data by executing the subsequent phases:
Step 1: CC decrypts the aggregated ciphertext by first retrieving the encrypted data
. From Equation (8), take:
(14)
The report
is still the ciphertext of the Paillier encryption system. It can be equated to
with
and CC uses the tuple
to recover
: as
(15)
Step 2: After decryption, CC uses
to obtain
as follows:
(16)
Step 3: The CC retrieves each aggregated value
,
using the decoding function.
The CC divides the binary representation of
into
bit chunks of maximal length less than
, so that the aggregated data
can be written as
(17)
2) Fault-Tolerant Data Reading: Upon receipt of the fault-tolerant report from
, CC initially authenticates the signature in accordance with the subsequent equation:
(13)
However, the released
(
) contains only a subset of the
values, and thus we cannot use
to eliminate the blind factor. Consequently, we utilize the
generated by the KGC to obtain
.
(18)
The CC retrieves each aggregated value
,
using the decoding function.
The CC divides the binary representation of
into
bit chunks of maximal length less than
, so that the aggregated data
can be written as
(19)
Thus, we can convert the aggregated data into a binary bit string, and then separate out the substrings of length
bits starting from the lower bit to retrieve the aggregated data for each dimension.
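The decoding step described above amounts to slicing the decrypted sum into fixed-width chunks, starting from the least significant bits. The sketch below assumes each meter packed its readings with the same chunk width and that the width was chosen large enough to hold the per-dimension sum; `decode` and the sample values are hypothetical.

```python
def decode(total: int, width: int, dims: int):
    """Split an aggregated integer into dims chunks of width bits, lowest chunk first."""
    mask = (1 << width) - 1
    return [(total >> (j * width)) & mask for j in range(dims)]


if __name__ == "__main__":
    width = 9                                  # enough for sums below 512
    meter_a = 5 | (7 << width) | (2 << 2 * width)
    meter_b = 90 | (100 << width) | (0 << 2 * width)
    # Adding the packed integers adds every dimension independently,
    # as long as no per-dimension sum exceeds the chunk capacity.
    print(decode(meter_a + meter_b, width, dims=3))
```

If a per-dimension sum could exceed the chunk capacity, the carry would spill into the next dimension and corrupt it, which is why the chunk width has to account for the number of meters as well as the maximum reading.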
6. System Analysis
In EAMA, we assume that the FNs and the CC are honest-but-curious. Each FN faithfully aggregates the encrypted multidimensional electricity usage data from its grid zone, while the CC correctly verifies and decrypts the data it receives. Both entities are nevertheless interested in users’ power consumption patterns and may try to extract useful information. Furthermore, an internal attacker (e.g., a curious user) may attempt to exploit the secret key of the SM and other essential parameters to obtain the original data of SMs located in other users’ residences.
1) Anonymity: During data transmission, each SM communicates with the FN under a temporary pseudonym rather than its real identity. As the parameter
introduces randomness into the identity generation process, there is no discernible association between different temporary identities produced by the same device. Only the KGC is able to recover a user’s true identity from the temporary identities. To further prevent attackers from correlating fine-grained electricity consumption data with user identities, the system periodically updates the temporary identities of all SMs. This technique safeguards user identity information and thus achieves the anonymity objective of the scheme.
2) Confidentiality: An outside attacker is presumed to possess the capability to eavesdrop on the public communication channel between the SM and the FN, thereby intercepting the encrypted message
. Nonetheless, due to the semantic security afforded by the Paillier encryption method, an adversary is unable to extract any meaningful information from the ciphertext, even if it is acquired. During the encrypted data reporting phase, each
first encodes the
-dimensional data
into a binary string
, where each
, for
. The private data is then set as
. Under the modified Paillier encryption scheme, the corresponding ciphertext is generated as:
where
. Each
holds a unique secret parameter
, which varies for each instance. Consequently, EAMA is resistant to chosen-plaintext attacks.
3) Privacy: In this framework, the FN holds only the encrypted data of anonymous users and does not possess the decryption key. The CC possesses the decryption key but cannot access individual users’ fine-grained encrypted data or their true identities, and the KGC, while aware of the users’ true identities, cannot obtain their detailed data. Even if the FN and the CC collude, so that the CC can apply its decryption key to individual users’ ciphertexts obtained from the FN, it remains difficult to remove the secret parameter
, preventing the identification of the actual users behind the electricity consumption data. Consequently, despite the potential for cooperation between two parties, the method can nevertheless proficiently obscure real-time electricity usage data from being linked to the user’s true identity, so safeguarding personal privacy.
4) Leakage Resilience: EAMA ensures that a group of colluding users does not threaten the privacy of other users. When an attacker
seeks to compromise a user’s privacy, they must have access to personal data and the associated secret share
. In EAMA, the secret shares
produced by the KGC are allocated independently, indicating that the compromise of secret shares from a subset of users does not disclose those of other users. Assume that
successfully compromises
users associated with a certain
and acquires their secret shares
. For these
users, Equation (1) can be reformulated as:
(20)
Despite this, the data privacy of the remaining users remains intact, as
does not possess the CC’s secret share of the sum
nor the Paillier private key. Therefore,
cannot compromise
. Consequently, with EAMA,
cannot access the private data of other users, irrespective of the number of compromised users. Even if the Paillier private key were to be obtained through logarithmic computation,
would still be unable to retrieve individual data contents due to the secure embedding of the secret shared value
within each ciphertext. Thus, in EAMA, encrypting power consumption data not only ensures leakage resilience but also preserves data confidentiality and the privacy of all SMs.
7. Performance Evaluation
Using the established cryptographic libraries MIRACL [27] and PBC [28], we evaluate the cost of the cryptographic operations employed in EAMA and in prior schemes. Our scheme is based on the modified Paillier cryptosystem with a 1024-bit key and a pairing defined over a 160-bit base field. We use the standard curve secp160r1 for ECC.
Table 2 presents the findings acquired on a computer equipped with an Intel i7-8750H CPU operating at 2.2 GHz and 8 GB of RAM. It is important to recognize that multiplication in
, modular addition, and hash operations are insignificant in comparison to exponentiation in
and pairing operations.
Table 2. Time consumption for related operations.
Symbol | Definition | Time (ms)
 | Time for general hash function | 0.01
 | Time for hashing to ECC point multiplication | 0.02
 | Time for modular exponentiation | 7.13
 | Time for regular multiplication | 1.58
 | Time for ECC point multiplication | 0.32
 | Time for ECC point addition | 0.34
 | Time for modular inversion | 8.72
 | Time for bilinear pairing operation | 16.81
7.1. Computation Cost
In EAMA, when
produces its output, it necessitates
to generate
and
to produce
. Consequently, the whole computational expense on the SM side is represented by
. Our approach considerably decreases the computing expense for the user. (
represents multidimensional data.)
In data aggregation, upon receiving
from all the SMs,
performs
bilinear pairing operations and one ECC hash-to-point operation for data verification. It then aggregates the encrypted data and produces a signature, requiring
multiplication operations to form the aggregated ciphertext, plus one ECC hash-to-point operation and one ECC point multiplication to create a new signature on the aggregated encrypted data. The overall computational cost of the FN is expressed as
. In the fog computing paradigm, fog nodes have greater processing capabilities than traditional nodes, so the aggregation procedure can be executed efficiently. Next, we evaluate the computational cost of existing schemes for each SM and AG separately. In the scheme proposed by Zhang et al. [25], where the data report is computed as
, each SM requires
multiplication operations, in addition to two exponentiation operations, one hash operation, and one multiplication to generate
. Generating the authenticator
requires two exponentiation operations, two hash operations, and one multiplication. Thus, the total computation costs on the SM side total
. In Data Aggregation, after AG receives
, from all SMs, it requires
multiplication operations to generate the aggregated ciphertext CT. It requires one exponentiation, one hash operation to compute
,
multiplication operations to generate
, and
exponentiation operations coupled with
multiplication operations to get
. The total computational costs of AG are represented as
. Using a comparable analytical method, Zuo et al.’s framework [23] indicates that the overall computational expense for each SM component is
, while the total computational expense for the AG component is
. Ultimately, the overall computational expense for each SM and aggregator in Boudia et al.’s framework [17] is
and
, respectively.
A comparison of the computational overhead on the SM side is presented in Figure 3.
Figure 3. Computational overhead on the SM side.
7.2. Communication Cost
Each SM encrypts
categories of data and transmits them to the respective aggregator. The communication cost can be divided into two components: communication from the SM to the FN and communication from the FN to the CC. Given that SMs are equipped with resource-constrained storage and computational devices, we concentrate on assessing the communication cost from SMs to the FN. In EAMA, the variables
are sent from the SM to
, where
and
. To provide uniformity, we designate the size of the
and timestamp as 160 bits and 64 bits, respectively, across all methods. Consequently, the transmission expense from SM to FN amounts to
bits. Next, we examine the communication from the FN to the CC. In EAMA, the elements
are transmitted from
to CC, where
and
. The communication cost from FN to CC is calculated as
bits. In the system proposed by Zhang et al. [25], each
transmits
to AG. Consequently, the transmission expense from
SMs to AG amounts to
bits. In the approach proposed by Zuo et al. [23], each
transmits
to AG. The transmission expense from
SMs to AG is
bits. In the framework suggested by Boudia et al. [17],
is transmitted from
to the respective FN, where
represents the ciphertext of the Paillier encryption system and
denotes a signature, resulting in a total communication cost of
bits. Ultimately, the comparison of communication costs is illustrated in Figure 4.
To evaluate our enhanced scheme against current ones, we established
as the real value and performed a comprehensive comparison of the communication costs from
SMs to the FN, as illustrated in Figure 4. The figure indicates that the communication cost of our design is lower than that of the other schemes, with the exception of the scheme proposed by Boudia et al. [17]. While the transmission cost from SM to FN in the scheme of Boudia et al. is somewhat lower than ours, their scheme lacks pseudonyms, potentially compromising its security.
Figure 4. Communication overhead on the SM side.
8. Conclusion
In this paper, we propose EAMA, an efficient and anonymous multidimensional data aggregation scheme for fog-computing-based smart grids. By leveraging an improved Paillier encryption scheme, our approach enhances encryption performance. Furthermore, the scheme incorporates a pseudonym mechanism for SMs, enabling FNs to aggregate data under pseudonyms before the CC retrieves the final aggregated results. The security analysis shows that the approach provides data privacy, confidentiality, integrity, and authentication. The performance analysis highlights the scalability of EAMA, the efficacy of its fault-tolerance mechanism, and its efficiency in both computation and communication. Furthermore, EAMA supports queries beyond simple summation, making it suitable for the application needs of smart grids in smart cities.
Funding
The work was supported by the National Natural Science Foundation of China under Grant 12061027, by the Natural Science Foundation of Guangxi of China under Grant 2018GXNSFBA281019, by the Doctoral Research Foundation of Guilin University of Technology under Grant GUTQDJJ2018033, and by the Opening Fund of Key Laboratory of Cognitive Radio and Information Processing, Ministry of Education under Grant CRKL210206.